1.  EXECUTIVE  SUMMARY


Airborne  mobile  ad  hoc  network  (MANET)  environments  require  a  quick  and  lightweight 
method  for  authentication.  These  constantly  changing  airborne  networks  (AN)  need  some  way 
to  identify  trustworthy  users  without  a  third-party  involved.  One  possible  answer  to  these 
requirements  is  the  zero-knowledge  proof  method.  Zero-knowledge  proof  systems  provide  an 
interactive  approach  for  an  entity  to  prove  the  possession  of  private  knowledge  without  revealing 
any  information  about  it.  Successful  challenge/response  interactions  between  a  Prover  and 
Verifier  provide  a  confidence  level  of  trust  to  the  Verifier  that  the  Prover  indeed  possesses  the 
private  information.  The  fact  that  the  private  knowledge  is  never  revealed  provides  benefits 
towards  achieving  a  protocol  that  is  secure  against  eavesdroppers.  The  desirable  characteristics 
in  a  zero-knowledge  proof  system  for  airborne  MANETs  are  (1)  low  amount  of  information  (i.e. 
bits  per  transaction)  transferred  between  parties,  (2)  low  number  of  iterations  of  the  protocol 
needed  to  establish  trust,  and  (3)  low  probability  that  an  untrustworthy  party  is  able  to  establish 
trust.  Characteristics  (1)  and  (2)  provide  a  lightweight  protocol,  while  characteristic  (3)  ensures 
that  the  protocol  is  strong. 

Since  zero-knowledge  proof  systems  require  a  verifier  to  check  that  the  information 
received  from  the  prover  exhibits  knowledge  of  the  private  input,  the  base  problem  (for  which 
the  private  input  is  the  solution)  must  be  easily  verifiable.  However,  for  the  protocol  to  be  hard 
to  cheat,  we  must  have  a  base  problem  that  is  difficult  to  solve  from  scratch.  This  leads  us  to 
consider  base  problems  that  fall  in  the  class  of  NP-complete  problems  -  computationally 
expensive  decision  problems  in  which  a  positive  solution  can  be  checked  in  polynomial-time. 
This  report  investigates  the  graph  theory  subset  of  the  class  of  NP-complete  problems  and  their 
use  as  base  problems  for  zero-knowledge  proof  systems.  In  particular,  the  problems  examined 
most  in-depth  are  all  related  to  either  the  sub-graph  isomorphism  problem  or  the  graph  coloring 
problem. 

In  this  paper,  several  approaches  are  formulated  into  a  zero-knowledge  proof  system,  and 
their characteristics are examined. Examples of the following graph problems are given: sub-graph
isomorphism, graph isomorphism, independent set, longest path problem, Hamiltonian
cycle problem, graph 3-coloring, equitable 3-coloring, satisfiability, and graph partitioning.

Considering  the  problem  classes  discussed,  the  least  promising  problem  is  the 
satisfiability  problem,  due  to  very  efficient  algorithms  that  are  able  to  solve  enormous  problem 
instances  very  quickly.  The  most  promising  group  appears  to  be  the  sub-graph  isomorphism 
class.  The  zero-knowledge  proof  systems  associated  with  this  class  of  problems  are  relatively 
lightweight  in  comparison  with  the  other  problem  classes,  and  several  of  the  problems  in  the 
class  have  many  difficult  instances  and  few  efficient  algorithms.  The  protocols  have  average 
strength  in  terms  of  the  proof  systems  for  graph-based  problems.  In  comparison,  the  graph 
coloring  class  has  many  difficult  instances  for  the  problems,  but  the  existing  zero-knowledge 
proof  systems  are  relatively  easy  to  cheat.  This  then  implies  that  the  proof  systems  are  not  as 
strong  as  those  in  the  sub-graph  isomorphism  class. 
2.  INTRODUCTION


In an airborne networking (AN) environment, the mobility of the network users necessitates an
agile authentication system. Zero-knowledge proof systems allow an interaction between parties
to determine trustworthiness in a quick and effective manner. In order to make these interactions
as fast and secure as possible, they are most often based on problems from the NP-complete
class, which contains many graph theory problems. A strong and lightweight zero-knowledge
protocol must satisfy the following criteria: it must have a small number of bits transferred
between parties, it must require few iterations to achieve a given trust level, and it must be
difficult for a cheater to pass as trustworthy.

This  report  is  outlined  as  follows.  Section  2  continues  to  provide  the  necessary 
background  information  in  graph  theory,  complexity  theory,  and  zero-knowledge  proof  systems. 
Sections  3  through  5  discuss  individual  problems  that  zero-knowledge  proof  systems  can  be 
based  on.  Section  6  presents  our  conclusions  and  future  work.  Section  7  lists  the  relevant 
references,  and  the  appendix  expands  upon  that  list  to  provide  an  annotated  bibliography. 

2.1  Graph  Theory  Background 

This  section  is  meant  as  a  guide  to  some  of  the  graph  theoretic  terms  and  concepts  employed  in 
this  report.  For  a  more  extensive  reference,  it  is  recommended  that  the  reader  consult  a  textbook 
such  as  Diestel’s  Graph  Theory  (Diestel  2006). 

A graph is a pair G = (V, E) such that E is a subset of V x V, where V is the set of
vertices and E is the set of edges in the graph. Vertices can also be called nodes. An edge is
incident to a vertex if the vertex is one of the edge’s endpoints. Two vertices are adjacent (also
called neighbors) if they are connected by an edge. The degree of a vertex (or valency) is the
number of edges incident to it. An adjacency matrix representation of a graph is a matrix in
which the rows and columns represent the vertices and an entry equal to 1 in row u and column
v implies the existence of an edge between vertices u and v, while an entry equal to 0 implies
that there is no edge between u and v.
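
The adjacency matrix representation can be sketched in a few lines of Python. This is a minimal illustration only; it assumes the vertices are labeled 0 to n-1, and the function names are hypothetical rather than taken from any library.

```python
def adjacency_matrix(n, edges):
    """Build the n x n adjacency matrix of a simple undirected graph.

    Assumption of this sketch: vertices are labeled 0..n-1.
    """
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        if u == v:
            raise ValueError("simple graphs have no loops")
        matrix[u][v] = 1
        matrix[v][u] = 1  # undirected, so the matrix is symmetric
    return matrix


def degree(matrix, u):
    """Degree of u: the number of 1 entries in row u."""
    return sum(matrix[u])


# A path on four vertices: 0 - 1 - 2 - 3
A = adjacency_matrix(4, [(0, 1), (1, 2), (2, 3)])
```

Here `A[u][v] == 1` exactly when u and v are adjacent, and the degree of a vertex is its row sum.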

In  discussing  graphs,  we  use  the  following  terms.  A  simple  graph  is  a  graph  in  which 
there  is  at  most  one  edge  between  distinct  vertices  and  there  are  no  edges  from  a  vertex  to  itself 
(called a loop). We will only deal with simple graphs in this report. The complement
$\bar{G} = (\bar{V}, \bar{E})$ of a graph $G = (V, E)$ is the graph in which $\bar{V} = V$ and $(x, y)$ is an edge in $\bar{G}$ if and only
if it is a non-edge in $G$, i.e. $\bar{E} = \{(x, y) \notin E : x, y \in V,\ x \neq y\}$. A regular graph is a graph in which
every vertex has the same degree. A labeled graph is a graph with labels (distinct or not distinct)
placed on the vertices or the edges. A weighted graph has edge labels that denote a weight on an
edge. These weights could represent distance or some other measure. An unweighted graph has
no labels on the edges. This could also be defined as a graph with all edge weights equal to 1.
The empty graph is the graph $G = (V, E)$ with $E = \emptyset$. A complete graph is a graph in which
every edge possible is present, i.e. for every pair of distinct vertices $x, y \in V$, there is an edge
$(x, y) \in E$.

There are many terms for describing the structures present within a graph. A sub-graph
$G' = (V', E')$ of a graph $G = (V, E)$ is a graph in which $V' \subseteq V$ and $E' \subseteq E$. We denote that $G'$ is
a sub-graph of $G$ by writing $G' \subseteq G$. An induced sub-graph $G' = (V', E')$ of a graph $G = (V, E)$
is a sub-graph of $G$ in which $E' = \{(x, y) \in E : x, y \in V'\}$. To indicate an induced sub-graph, we
write $G' = G[V']$. A path is a sequence of vertices and edges such that no vertices and no edges
are repeated. A cycle is a path with the exception that the first and last vertices are the same. A
Hamiltonian cycle or path is a cycle or path that travels through every vertex in the graph. The
length of a path or cycle is the number of edges. An independent set is a set $S \subseteq V$ in a graph
$G = (V, E)$ such that the edge set of $G[S]$ is an empty set. A clique is the complement of an
independent set, i.e. a sub-graph $G' = (V', E')$ of $G = (V, E)$ such that $E' = \{(x, y) : x, y \in V',\ x \neq y\}$.
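
The definitions of induced sub-graph, independent set, clique, and complement translate directly into set operations. The sketch below is illustrative only; it assumes edges are stored as two-element frozensets, and all function names are hypothetical.

```python
from itertools import combinations


def induced_edges(edges, subset):
    """Edge set of the induced sub-graph G[subset]."""
    return {e for e in edges if e <= subset}


def is_independent_set(edges, subset):
    """S is independent when G[S] has no edges."""
    return not induced_edges(edges, subset)


def is_clique(edges, subset):
    """S is a clique when every pair of distinct vertices in S is adjacent."""
    return all(frozenset(pair) in edges for pair in combinations(subset, 2))


def complement_edges(vertices, edges):
    """Edges of the complement: all pairs of distinct vertices not in E."""
    return {frozenset(pair) for pair in combinations(vertices, 2)} - edges


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
V = {0, 1, 2, 3}
E = {frozenset(p) for p in [(0, 1), (1, 2), (2, 0), (2, 3)]}
```

In this example {0, 1, 2} is a clique in the graph and an independent set in its complement, which is the sense in which the two notions are complementary.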

Much of this report utilizes the concept of a graph isomorphism. Two graphs $G = (V, E)$
and $G' = (V', E')$ are isomorphic if there exists a bijective function $f: V(G) \to V(G')$ such that
$(x, y) \in E$ if and only if $(f(x), f(y)) \in E'$. Less formally, two graphs are isomorphic if they
exhibit the same structures. A sub-graph isomorphism is an isomorphism from a graph $G$ to a
sub-graph $H \subseteq G'$.
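
Checking a proposed isomorphism is straightforward even when finding one is not. The following sketch verifies a candidate mapping given as a Python dict; the representation (edges as two-element frozensets) and the function names are assumptions made for illustration.

```python
def map_edges(edges, f):
    """Image of an edge set under the vertex mapping f."""
    return {frozenset(f[v] for v in edge) for edge in edges}


def is_isomorphism(f, vertices_g, edges_g, vertices_h, edges_h):
    """True if f is a bijection V(G) -> V(H) with (x,y) in E iff (f(x),f(y)) in E'."""
    if set(f) != set(vertices_g) or set(f.values()) != set(vertices_h):
        return False
    if len(set(f.values())) != len(f):  # f must be injective
        return False
    return map_edges(edges_g, f) == set(edges_h)


def is_subgraph_isomorphism(f, vertices_g, edges_g, vertices_h, edges_h):
    """True if f embeds G into H: injective, and every edge of G maps to an edge of H."""
    if set(f) != set(vertices_g) or not set(f.values()) <= set(vertices_h):
        return False
    if len(set(f.values())) != len(f):
        return False
    return map_edges(edges_g, f) <= set(edges_h)


# G is the path 0 - 1 - 2; H is the path a - b - c.
VG, EG = [0, 1, 2], {frozenset((0, 1)), frozenset((1, 2))}
VH, EH = ["a", "b", "c"], {frozenset(("a", "b")), frozenset(("b", "c"))}
```

Both checks run in time polynomial in the size of the graphs, which is the property that makes these problems usable as base problems.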

2.2  NP-Completeness 

This  section  is  an  introduction  to  some  of  the  theory  of  NP-completeness  and  complexity  theory 
that  is  utilized  in  this  report.  For  a  more  extensive  reference,  it  is  recommended  that  the  reader 
consult  a  textbook  such  as  Skiena’s  Algorithm  Design  Manual  (Skiena  2008). 

The complexity classes involved in this report are primarily the classes P (Polynomial-time)
and NP (Non-deterministic Polynomial). Because we will not discuss Turing machines in
this  report,  we  will  state  the  somewhat  less  formal  definitions  of  the  complexity  classes.  The 
class  NP  is  the  class  of  decision  problems  for  which  any  yes-instance  has  a  solution  that  is 
verifiable  in  polynomial  time.  The  class  P  contains  all  decision  problems  that  can  be  solved  in 
polynomial  time,  and  hence  also  have  solutions  that  can  be  verified  in  polynomial  time,  implying 
that $P \subseteq NP$.

A  problem  L  in  the  class  NP  is  in  the  subclass  of  NP-complete  problems  if  every  problem 
in NP can be reduced to the problem L in polynomial time. A reduction from problem K to
problem L is an algorithm which takes as input an arbitrary instance of problem K and outputs an
instance of problem L that has the same yes/no answer. Given this definition, it is clear that the class of NP-complete problems
contains  the  hardest  problems  in  the  class  NP,  as  an  easy  solution  for  one  NP-complete  problem 
leads  to  an  easy  solution  for  all  problems  in  the  class  NP. 
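
As a small illustration of a reduction, the sketch below maps an instance of the independent set problem to an instance of the clique problem by taking the complement graph. This textbook example is chosen for illustration and is not one of the reductions analyzed in this report; the brute-force deciders are included only to confirm on tiny graphs that the answer is preserved.

```python
from itertools import combinations


def reduce_independent_set_to_clique(vertices, edges, k):
    """Map an Independent Set instance (G, k) to a Clique instance (complement of G, k)."""
    all_pairs = {frozenset(p) for p in combinations(vertices, 2)}
    return vertices, all_pairs - edges, k


def has_clique(vertices, edges, k):
    """Brute-force Clique decision (exponential time; tiny demos only)."""
    return any(all(frozenset(p) in edges for p in combinations(s, 2))
               for s in combinations(vertices, k))


def has_independent_set(vertices, edges, k):
    """Brute-force Independent Set decision (exponential time; tiny demos only)."""
    return any(all(frozenset(p) not in edges for p in combinations(s, 2))
               for s in combinations(vertices, k))


# The path 0 - 1 - 2 - 3.
V = [0, 1, 2, 3]
E = {frozenset(p) for p in [(0, 1), (1, 2), (2, 3)]}
```

The reduction itself only builds the complement, which takes polynomial time; the expensive part is solving the resulting instance.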

Because of the work published by Cook (Cook 1971), which proves that satisfiability is
NP-complete (the first problem shown to be so), proving that a problem is NP-complete is
somewhat easier than the definition implies. As Cook proved that every problem in NP is
reducible to satisfiability, in order to prove that a problem in NP is in the class of NP-complete
problems, we need only prove that a known NP-complete problem L reduces to our problem.
Then through Cook’s theorem and following the chain of reductions from satisfiability to L, we
have shown that every problem in the class NP reduces to our problem.

2.3  Zero-Knowledge  Proof  Systems 

This  section  is  an  introductory  guide  to  some  of  the  theory  and  concepts  of  zero-knowledge 
proof  systems  that  are  used  in  this  report.  For  a  more  extensive  reference,  it  is  recommended 
that  the  reader  consult  a  textbook  such  as  Simmons’s  Contemporary  Cryptology  (Simmons 
1992). 

We  begin  with  the  notion  of  an  interactive  proof  system.  An  interactive  proof  system  is 
an  interaction  between  two  participants,  called  the  prover  and  the  verifier,  in  which  the  prover 
attempts  to  prove  some  fact  (or  knowledge  of  some  private  input)  to  the  verifier.  An  interactive 
proof  system  is  formally  defined  as  a  protocol  based  on  a  decision  problem  which  satisfies  the 
following  properties: 

Completeness: Each yes-instance of the decision problem leads to acceptance by the
verifier with probability at least $1 - n^{-k}$ for any constant $k > 0$, where $n$ is the
size of the problem instance.

Soundness: Each no-instance of the decision problem leads to rejection by the verifier
with probability at least $1 - n^{-k}$ for any prover (honest or cheating).

A  zero-knowledge  proof  system  is  an  interactive  proof  system  with  an  additional 
requirement: the zero-knowledge property must be satisfied. The zero-knowledge property
ensures  that  the  verifier  cannot  gain  any  information  from  the  interaction  with  the  prover  that 
could  not  have  been  determined  alone.  This  also  guarantees  that  any  eavesdropper  cannot  gain 
knowledge  by  listening  to  the  conversation.  In  a  zero-knowledge  proof  system,  the  interaction 
begins  with  the  prover  presenting  a  commitment  to  some  information  about  the  graph,  which  is 
followed  by  a  challenge  from  the  verifier.  The  challenge  consists  of  a  request  for  specific 
information  about  the  problem  instance.  To  prove  that  the  zero-knowledge  property  is  satisfied 
in  such  an  interaction,  we  use  simulators.  This  proof  method  of  the  zero-knowledge  property  is 
structured  as  follows: 

1. Verifier simulates the prover

• Given the set of possible challenges, the verifier chooses one challenge uniformly at
random and commits to a correct response for that challenge.

2. Verifier simulates the verifier

• Using a probabilistic algorithm, the verifier decides which challenge to
send.

• If the challenge does not match what was committed to in step 1, the
verifier backs up the algorithm to the state it was in at the beginning of
step 1 and starts the simulation over.

If  the  process  above  generates  a  conversation  that  is  the  same  as  one  that  could  have  been 
generated  with  an  honest  prover  (one  that  is  different  from  the  verifier),  then  the  zero-knowledge 
property  is  satisfied. 
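
The guess-and-rewind structure of the simulator can be sketched abstractly. The code below is a toy model under stated assumptions: commitments and responses are abstracted away, challenges are integers, and `choose_challenge` is a hypothetical stand-in for the verifier's probabilistic challenge algorithm.

```python
import random


def simulate_conversation(num_challenges, choose_challenge, rng):
    """Produce one matching (prepared, challenge) pair by guess-and-rewind.

    num_challenges   -- size of the challenge set (assumed small and finite)
    choose_challenge -- hypothetical stand-in for the verifier's challenge
                        algorithm; takes rng and returns an int
    Returns the matched challenge and the number of attempts needed.
    """
    attempts = 0
    while True:
        attempts += 1
        # Step 1: pick, uniformly at random, the one challenge to prepare for.
        prepared_for = rng.randrange(num_challenges)
        # Step 2: run the verifier's own challenge algorithm.
        challenge = choose_challenge(rng)
        if challenge == prepared_for:
            return challenge, attempts
        # Mismatch: discard this attempt (rewind) and start over.


rng = random.Random(0)
challenge, attempts = simulate_conversation(2, lambda r: r.randrange(2), rng)
```

Because the guess in step 1 is uniform and independent of the challenge, each attempt matches with probability 1/num_challenges, so a protocol with two challenges needs two attempts on average.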

An  easily  understood  example  of  a  zero-knowledge  proof  system  illustrates  that  the 
prover  knows  how  to  solve  a  Rubik’s  cube  (the  algorithm  is  the  private  input).  The  verifier 
scrambles  a  Rubik’s  cube  and  hands  it  to  the  prover.  The  prover  turns  away  so  that  the  verifier  is 
unable  to  see  the  cube,  and  then  attempts  to  solve  the  puzzle.  If  the  prover  knows  the  algorithm, 
this  will  be  an  easy  task  and  the  prover  will  quickly  hand  a  solved  Rubik’s  cube  back  to  the 
verifier.  If  the  prover  does  not  know  the  algorithm,  the  prover  may  or  may  not  be  able  to  solve 
the puzzle. Most likely, a prover that does not know how to solve a Rubik’s cube will not be able to
solve the puzzle quickly several times in a row, and so the verifier will (eventually) see that the
prover  does  not  know  the  algorithm.  A  prover  that  does  know  the  algorithm  will  quickly  solve 
the  cube  as  many  times  as  the  verifier  wishes. 

An  important  component  of  a  zero-knowledge  proof  system  is  the  commitment.  Zero- 
knowledge  proof  systems  usually  require  that  the  prover  has  some  method  of  “locking  up” 
information  about  the  problem  instance  prior  to  receiving  the  verifier’s  challenge.  Otherwise, 
the  prover  would  be  able  to  manufacture  a  response  to  the  verifier’s  challenge,  and  this  new 
response  may  or  may  not  be  consistent  with  the  problem  instance.  For  example,  in  the  Rubik’s 
cube  zero-knowledge  proof  system  discussed  in  the  last  paragraph,  the  verifier  may  require  that 
the  prover  remain  in  the  same  room  so  that  the  verifier  can  be  sure  that  the  prover  returns  the 
same  Rubik’s  cube  given  out  in  the  beginning.  The  structure  of  a  zero-knowledge  proof  system 
is  such  that  a  response  to  any  one  challenge  does  not  reveal  the  private  input,  but  responding  to 
all  challenges  reveals  the  private  input.  Hence  the  prover  must  be  able  to  simultaneously  commit 
to  all  challenge  responses,  but  reveal  each  one  individually. 

There are several different methods by which the prover can commit. The simplest to
understand (but least practical to implement) is to use locked boxes. The prover breaks up the
graph into pieces, which are stored in locked boxes. Then the prover determines which boxes to
open by observing the verifier’s challenge. Note that the problem type will determine how the
graph is broken up and stored. Another method of commitment is through encryption by keys.
If the prover is able to generate keys, then the prover may encrypt the graph and send the
encrypted copy to the verifier. Then depending on the verifier’s challenge, the prover sends keys
for decrypting the information necessary to answer the challenge. The method used varies and is
usually determined by the implementation of the proof system.
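
One common way to realize a "locked box" in software is a hash-based commitment. This technique is not described in this report; the sketch below is an illustrative assumption that uses only the Python standard library, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(message: bytes):
    """Lock a message: return (commitment, opening nonce).

    The commitment can be sent to the verifier immediately; the nonce is kept
    private until the prover chooses to open this particular box.
    """
    nonce = secrets.token_bytes(32)
    commitment = hashlib.sha256(nonce + message).digest()
    return commitment, nonce


def verify_opening(commitment: bytes, nonce: bytes, message: bytes) -> bool:
    """Check that (nonce, message) opens the commitment received earlier."""
    expected = hashlib.sha256(nonce + message).digest()
    return hmac.compare_digest(commitment, expected)


# The prover commits to one piece of the graph, e.g. a single edge.
box, key = commit(b"edge:1-2")
```

The prover would send one such commitment per piece of the graph and later reveal only the nonces and messages needed to answer the verifier's challenge.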
3.  METHODS,  ASSUMPTIONS  AND  PROCEDURES

3.1  SUB-GRAPH  ISOMORPHISM  CLASS

The  sub-graph  isomorphism  class  contains  many  NP-complete  problems  that  can  be  obtained  by 
a  reduction  from  the  sub-graph  isomorphism  problem.  Figure  1  illustrates  the  structure  of  the 
sub-graph  isomorphism  problem  and  its  related  subproblems.  Note  that  the  chart  lists  the  most 
general  problem  at  the  top  (sub-graph  isomorphism)  and  each  subproblem  allows  more  problem 
restrictions  than  the  superproblem  does.  For  example,  graph  isomorphism  is  a  more  specific 
instance of sub-graph isomorphism in that it requires that $H = G_2$. Thus if a problem is
NP-complete, all problems that contain it as a subproblem must also be NP-complete, but not vice
versa.


Figure 1: The subproblem structure of the sub-graph isomorphism problem

3.1.1  Sub-graph  Isomorphism  Problem 

The general sub-graph isomorphism problem is stated as follows: Given two graphs $G_1$ and $G_2$,
is there a sub-graph $H$ of $G_2$ such that $G_1$ is isomorphic to $H$? The sub-graph isomorphism
problem  (SGI)  is  an  NP-complete  problem  (Garey  and  Johnson  1979),  and  has  many  well-known 
subproblems  associated  with  it. 


3.1.1.1  Algorithms

In  the  world  of  NP-complete  problems,  there  are  two  ways  to  define  what  makes  a  “good” 
algorithm.  The  first  is  a  theoretical  definition,  in  which  the  computational  complexity  of  the 
problem  is  reduced  for  either  the  general  class  of  all  instances  or  for  a  specific  class  of 
subproblems.  The  second  is  an  experimental  definition.  Using  different  methods  of  attacking 
the  problem,  we  try  to  create  an  algorithm  that  will  solve  most  instances  of  the  problem  in  a  short 
amount  of  time,  but  may  not  be  very  efficient  in  some  rarely-occurring  worst-case  instance. 
Because the sub-graph isomorphism problem has many useful applications, a significant amount of
research is being done on solving it using both approaches.


Theoretical  Results: 

Algorithms  classified  as  theoretical  results  are  aimed  at  lowering  the  computational  complexity 
bounds  that  currently  exist  for  solving  the  sub-graph  isomorphism  problem.  These  algorithms 
generally  are  not  implemented  or  tested  against  each  other  on  actual  instances  of  the  problem. 
Since  the  general  sub-graph  isomorphism  problem  is  known  to  be  NP-complete,  these  algorithms 
tend  to  restrict  the  problem  in  some  manner  in  order  to  make  the  problem  easier  to  solve. 

Many  algorithms  exist  for  solving  the  sub-graph  isomorphism  problem  on  specific 
classes  of  graphs.  For  example,  when  considering  the  class  of  planar  graphs  we  can  reduce  the 
running  time  significantly.  Using  a  dynamic  programming  method,  Dorn  has  developed  an 
algorithm with running time on the order of $2^{O(k)} n$, where the graph H has k nodes and G has n
nodes (Dorn 2009). This implies that if we consider restricting the problem so that the number
of nodes in the sub-graph H is fixed, then $2^{O(k)}$ becomes a constant and hence the algorithm is
linear in n.

Another  restriction  on  the  set  of  graphs  that  has  seen  good  results  is  to  consider  only 
graphs  of  bounded  tree-width.  To  define  tree-width,  we  need  a  few  other  definitions  first  (Alon, 
Yuster  and  Zwick  1995). 

Definition 1: Let $G = (V, E)$ be a graph. A tree-decomposition of $G$ is a pair

\[ (\{X_i : i \in I\},\ T = (I, F)) \]

where $T$ is a tree and $\{X_i : i \in I\}$ is a collection of subsets of $V$ such that:

1. $\bigcup_{i \in I} X_i = V$,

2. $\forall (v, w) \in E$, $\exists i \in I$ such that $\{v, w\} \subseteq X_i$, and

3. $\forall v \in V$, when we restrict $T$ to the vertex set $\{i \in I : v \in X_i\}$, we still have
a connected tree.

Definition 2: Let $G = (V, E)$ be a graph. Let the set of all tree-decompositions of $G$ be
denoted $TD(G)$. The tree-width of $G$ is:

\[ TW(G) = \min_{T \in TD(G)} \max_{i \in I} |X_i| - 1. \]
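
The three conditions of a tree-decomposition can be checked mechanically. The sketch below is an illustration under assumptions: bags are given as a dict from tree-node index to a set of vertices, the tree as a list of index pairs, and all names are hypothetical. It checks one given decomposition; it does not compute the tree-width, which is the minimum over all decompositions.

```python
from collections import deque


def _connected(nodes, adjacency):
    """True if the non-empty node set induces a connected sub-graph of the tree."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, queue = {start}, deque([start])
    while queue:
        i = queue.popleft()
        for j in adjacency[i]:
            if j in nodes and j not in seen:
                seen.add(j)
                queue.append(j)
    return seen == nodes


def is_tree_decomposition(vertices, edges, bags, tree_edges):
    adjacency = {i: set() for i in bags}
    for i, j in tree_edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    # T must be a tree: connected, with |I| - 1 edges.
    if len(tree_edges) != len(bags) - 1 or not _connected(bags, adjacency):
        return False
    # Condition 1: the bags cover V.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Condition 2: every edge of G lies inside some bag.
    if not all(any({v, w} <= bag for bag in bags.values()) for v, w in edges):
        return False
    # Condition 3: the bags containing any fixed vertex form a connected subtree.
    return all(_connected({i for i, bag in bags.items() if v in bag}, adjacency)
               for v in vertices)


def width(bags):
    """Width of this decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# The path 0 - 1 - 2 - 3 with one bag per edge.
V, E = [0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)]
bags = {0: {0, 1}, 1: {1, 2}, 2: {2, 3}}
tree = [(0, 1), (1, 2)]
```

The width of any valid decomposition is an upper bound on TW(G); for the path above the decomposition shown has width 1.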

One of the most recently published randomized algorithms for solving the sub-graph
isomorphism problem is shown to have running time $O^*(2^k n^{2t})$ when H has k nodes and the
tree-width of H is at most t (Fomin, et al. 2009)*.


* “$O^*()$ notation hides factors polynomial in the instance size n and the parameter k” - (Fomin, et al. 2009)
While theoretical results are useful in determining what is possible and impossible in
terms of creating new algorithms for solving the sub-graph isomorphism problem, these methods
are not always practical or useful for implementation and applications. The algorithm may
appear fast in terms of Bachmann-Landau notation (Big-Oh notation), but this can often be
misleading. For example, if an algorithm has computational complexity $O(n)$, it is possible that
the exact running time has some enormous constant factor $c$. In this case, even though
the algorithm is linear, a running time of $c \cdot n$ is going to be very costly. For solving
specific instances of the problem, it may be more efficient to consider an algorithm that has
worse computational complexity in a worst case scenario, but good experimental results on large
databases of graphs.


Experimental  Results: 

In  terms  of  algorithms  that  are  practical  to  use,  it  is  necessary  to  review  the  results  obtained  by 
implementing  the  algorithm  and  testing  it  on  several  databases  of  graphs.  While  it  is  not  possible 
to  test  the  algorithm  on  every  possible  graph,  we  can  often  get  a  good  idea  of  how  useful  an 
algorithm  will  be  by  running  it  on  specific  classes  of  graphs  and  instances  that  are  known  to  be 
difficult.  The  algorithms  discussed  in  this  section  currently  appear  to  be  the  most  popular  for 
comparing  new  algorithms  against,  and  so  are  generally  understood  to  be  the  fastest  algorithms 
currently  available.  Also  included  are  several  new  algorithms  that  appear  to  perform  quite  well 
against  the  existing  front-runners. 

VF2 and Ullmann’s algorithm are the most popular choices for efficient sub-graph
isomorphism solvers. Two different filtering algorithms published this year (2010) also seem
promising. Filtering algorithms aim to reduce, for each vertex in the smaller graph, the set of
possible target vertices in the larger graph that it can be mapped to under an isomorphism. By repeatedly
reducing the set of target vertices, the filtering aims to eventually obtain a target set of size one,
in which case the mapping is clear (Solnon 2010).

Ullmann’s algorithm, published in 1976, is surprisingly still a popular and fast sub-graph
isomorphism solver. This algorithm uses a backtracking method to solve the problem
efficiently in most cases (Ullmann 1976). However, it is often costly on larger instances, where
it is outperformed by newer methods.

Arguably the best solver available, VF2 (also referred to as VFLib) is an algorithm for
solving both the graph and sub-graph isomorphism problems. By defining certain feasibility
rules, VF2 is able to reduce the number of possible options and repeatedly extend a partial
matching until the correct sub-graph is found (Cordella, et al. 2004).
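
The general idea of extending a partial matching with backtracking can be sketched as follows. This is a plain backtracking search with a simple degree and adjacency check; it is neither Ullmann's refinement procedure nor the VF2 feasibility rules, and the adjacency-dict representation and names are assumptions made for illustration.

```python
def find_subgraph_isomorphism(pattern_adj, target_adj):
    """Search for an embedding of the pattern graph into the target graph.

    Both graphs are dicts mapping each vertex to the set of its neighbors.
    Returns a dict pattern vertex -> target vertex, or None if none exists.
    """
    # Place high-degree pattern vertices first; they are the most constrained.
    order = sorted(pattern_adj, key=lambda u: -len(pattern_adj[u]))
    mapping, used = {}, set()

    def feasible(u, v):
        if len(pattern_adj[u]) > len(target_adj[v]):
            return False
        # Every already-mapped neighbor of u must land on a neighbor of v.
        return all(mapping[w] in target_adj[v]
                   for w in pattern_adj[u] if w in mapping)

    def extend(depth):
        if depth == len(order):
            return True
        u = order[depth]
        for v in target_adj:
            if v not in used and feasible(u, v):
                mapping[u] = v
                used.add(v)
                if extend(depth + 1):
                    return True
                del mapping[u]  # backtrack
                used.discard(v)
        return False

    return dict(mapping) if extend(0) else None


triangle = {"p": {"q", "r"}, "q": {"p", "r"}, "r": {"p", "q"}}
target = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
embedding = find_subgraph_isomorphism(triangle, target)
```

The worst-case running time of such a search is exponential, which is consistent with the NP-completeness of the problem; practical solvers differ mainly in how aggressively they prune.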

The  filtering  method  ILF  (Iterative  Labeling  Filtering)  begins  with  an  initial  labeling  of 
the  vertices  in  both  graphs  by  some  invariant  property  such  as  vertex  degree.  An  invariant 
property of a graph is one that remains constant under isomorphisms. From this it is
immediately clear that two vertices with different labels cannot possibly be mapped to one
another. The algorithm then expands the lists as multisets (sets with repetition allowed) by
adding the labels of adjacent nodes. This process is repeated as many times as desired. At each
step, the algorithm uses an auxiliary bipartite graph to determine the compatibility of two
vertices (Zampelli, Deville and Solnon 2010 (to appear)).

A new algorithm, AllDifferent-Based Filtering, introduced by Christine Solnon (Solnon
2010) uses a method known as “local all different” (LAD). For each vertex u in our sub-graph
H, and for each possible target vertex v in G for u, the algorithm constructs a bipartite graph with
vertex set (N(u), N(v)), where N(u) is the set of all vertices that are adjacent to u in H and N(v) is the
set of all vertices that are adjacent to v in G. The edges in this bipartite graph are of the form (u’, v’),
where v’ is a possible target vertex for u’. Then the
algorithm searches for a matching in this bipartite graph (an independent set of edges) that
covers all of N(u). If no such matching exists, v is no longer considered as a possible target
for u.
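
The filtering step just described can be sketched with a standard augmenting-path bipartite matching. This is an illustration of the idea only, not Solnon's implementation; the domain representation (a dict from each pattern vertex to its set of candidate target vertices) and the function names are assumptions.

```python
def has_covering_matching(left, candidates):
    """True if every vertex in `left` can be matched to a distinct candidate."""
    match = {}  # target vertex -> pattern vertex currently matched to it

    def try_assign(u, visited):
        for v in candidates[u]:
            if v in visited:
                continue
            visited.add(v)
            # Take v if it is free, or if its current partner can move elsewhere.
            if v not in match or try_assign(match[v], visited):
                match[v] = u
                return True
        return False

    return all(try_assign(u, set()) for u in left)


def lad_filter(pattern_adj, target_adj, domains):
    """Drop v from domains[u] when N(u) cannot be matched into N(v)."""
    changed = True
    while changed:
        changed = False
        for u in pattern_adj:
            for v in list(domains[u]):
                # Edges (u', v'): v' is a neighbor of v that is still possible for u'.
                candidates = {w: domains[w] & target_adj[v] for w in pattern_adj[u]}
                if not has_covering_matching(pattern_adj[u], candidates):
                    domains[u].discard(v)
                    changed = True
    return domains


triangle = {"p": {"q", "r"}, "q": {"p", "r"}, "r": {"p", "q"}}
target = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
domains = lad_filter(triangle, target, {u: set(target) for u in triangle})
```

In this example the pendant vertex 3 is removed from every domain, because a triangle vertex needs two distinct neighbors as targets and vertex 3 has only one.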


An  alternative  approach  to  the  problem  is  to  formulate  it  in  a  way  that  takes  advantage  of 
efficient  solvers  from  other  problems  and  areas.  One  such  approach  formulates  the  sub-graph 
isomorphism  problem  as  an  integer  linear  program  (LeBodic,  et  al.  2009).  We  define  the 
variables  of  this  linear  program  as  follows.  We  define  G  =  (N,  L)  and  H  =  (V,  E). 


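For all pairs of vertices $(i, k) \in V \times N$, define

\[ x_{i,k} = \begin{cases} 1 & \text{if } i \mapsto k \\ 0 & \text{otherwise} \end{cases} \]

For all pairs of edges $(ij, kl) \in E \times L$, define

\[ y_{ij,kl} = \begin{cases} 1 & \text{if } ij \mapsto kl \\ 0 & \text{otherwise} \end{cases} \]

We define the constraints as follows:

•  Every vertex of H maps to a unique vertex of G

\[ \sum_{k \in N} x_{i,k} = 1, \quad \forall i \in V \tag{1} \]

•  Every edge of H maps to a unique edge of G

\[ \sum_{kl \in L} y_{ij,kl} = 1, \quad \forall ij \in E \tag{2} \]

•  Every vertex of G is targeted by at most one vertex of H

\[ \sum_{i \in V} x_{i,k} \leq 1, \quad \forall k \in N \tag{3} \]

•  If $i \mapsto k$ then any edge starting in i maps to an edge starting with k

\[ \sum_{kl \in L} y_{ij,kl} = x_{i,k}, \quad \forall k \in N,\ \forall ij \in E \tag{4} \]

•  If $j \mapsto l$ then any edge ending in j maps to an edge ending in l

\[ \sum_{kl \in L} y_{ij,kl} = x_{j,l}, \quad \forall l \in N,\ \forall ij \in E \tag{5} \]

In (4) the vertex k is fixed and the sum runs over the edges of L that start at k; in (5) the
vertex l is fixed and the sum runs over the edges of L that end at l.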


Combining these, it is clear that if we find a solution to this set of constraints then we have a
graph-sub-graph  isomorphism.  Hence,  the  objective  function  of  this  integer  linear  program  is 
irrelevant.  If  there  is  some  additional  information  (for  example,  if  the  problem  is  geometric), 
then  there  may  be  a  useful  objective  function  (such  as  minimizing  the  distances  between  nodes). 
However  if  not,  we  can  set  the  objective  function  to  be  some  irrelevant  constant  function.  This 
gives  us  the  general  linear  program: 


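\[
\begin{array}{lll}
\text{Max} & \sum_{i \in V} 0 & \\
\text{Subject to} & \sum_{k \in N} x_{i,k} = 1, & \forall i \in V \\
 & \sum_{kl \in L} y_{ij,kl} = 1, & \forall ij \in E \\
 & \sum_{i \in V} x_{i,k} \leq 1, & \forall k \in N \\
 & \sum_{kl \in L} y_{ij,kl} = x_{i,k}, & \forall k \in N,\ \forall ij \in E \\
 & \sum_{kl \in L} y_{ij,kl} = x_{j,l}, & \forall l \in N,\ \forall ij \in E \\
\text{and} & x_{i,k},\ y_{ij,kl} \in \{0, 1\}, & \forall (i, k) \in V \times N,\ \forall (ij, kl) \in E \times L
\end{array}
\]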

Figure  2:  Integer  Linear  Program  for  SGI 
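
To make the role of the variables concrete, the sketch below builds the 0/1 values of x and y from a candidate vertex mapping and checks the five constraints directly. It is a checker, not a solver, and it assumes directed edges stored as (start, end) pairs, following the "starting in" and "ending in" wording of the constraints. The function names are hypothetical; V, E, N and L follow the definitions H = (V, E) and G = (N, L).

```python
def assignment_from_mapping(V, E, N, L, mapping):
    """Build the 0/1 variables x[i, k] and y[ij, kl] induced by a vertex mapping."""
    x = {(i, k): int(mapping[i] == k) for i in V for k in N}
    y = {(e, t): int(mapping[e[0]] == t[0] and mapping[e[1]] == t[1])
         for e in E for t in L}
    return x, y


def satisfies_constraints(V, E, N, L, x, y):
    """Check constraints (1)-(5) of the integer linear program."""
    c1 = all(sum(x[i, k] for k in N) == 1 for i in V)
    c2 = all(sum(y[e, t] for t in L) == 1 for e in E)
    c3 = all(sum(x[i, k] for i in V) <= 1 for k in N)
    # (4): for fixed k, sum over target edges starting at k.
    c4 = all(sum(y[e, t] for t in L if t[0] == k) == x[e[0], k]
             for k in N for e in E)
    # (5): for fixed l, sum over target edges ending at l.
    c5 = all(sum(y[e, t] for t in L if t[1] == l) == x[e[1], l]
             for l in N for e in E)
    return c1 and c2 and c3 and c4 and c5


# H is the directed path a -> b -> c; G is the directed path 0 -> 1 -> 2 -> 3.
V, E = ["a", "b", "c"], [("a", "b"), ("b", "c")]
N, L = [0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)]
good = assignment_from_mapping(V, E, N, L, {"a": 1, "b": 2, "c": 3})
bad = assignment_from_mapping(V, E, N, L, {"a": 0, "b": 1, "c": 3})
```

A mapping that sends an edge of H to a non-edge of G leaves the corresponding y variables at zero, so constraint (2) fails for that edge.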

Formulating  the  linear  program  is  now  done,  and  all  that  is  needed  is  an  efficient  solver 
to  work  on  it.  The  testing  done  in  (LeBodic,  et  al.  2009)  is  on  very  application-specific  instances 
(architectural  floor  plans)  making  it  difficult  to  evaluate  in  terms  of  the  more  general  instances 
seen  in  the  graph  databases. 

Testing  of  Experimental  Algorithms: 

While most newly published algorithms are compared against either or both of Ullmann’s
algorithm and VF2, most are not compared to any other recently developed methods. One
difficulty  in  comparing  the  algorithms  is  that  only  recently  has  it  become  popular  to  use  publicly 
available  graph  databases  for  testing  in  a  consistent  manner,  making  it  difficult  to  evaluate  and 
interpret  the  results  that  are  reported. 

The two main databases that are available for testing sub-graph isomorphism algorithms
are the GraphBase database and the VFLib database. Several authors also have created their own
classes of graphs for testing, such as the scale-free networks database created in (Zampelli,
Deville and Solnon 2010 (to appear)). The Stanford GraphBase database provides generators to
create various different classes of graphs and is available free (Knuth 1993). The VFLib Graph
Matching Library was created specifically for the graph isomorphism and sub-graph
isomorphism problems and is also available free of charge (P. Foggia 2001).

On  most  large  instances  of  the  sub-graph  isomorphism  problem,  VF2  outperforms 
Ullmann’s algorithm (see, for example, Figure 15 in (Lipets, Vanetik and Gudes 2009)). For this
reason,  many  authors  compare  their  algorithm  only  with  VF2.  One  analysis  and  comparison  of 
VF2,  ILF,  and  LAD,  can  be  found  in  (Solnon  2010).  This  paper  shows  that  LAD  usually 
outperforms  both  VF2  and  ILF  when  run  on  both  the  GraphBase  database  and  the  VFLib 
database.  However,  one  thing  that  is  clear  is  that  no  solver  performs  best  in  every  case. 


3.1.1.2 Existing Zero-Knowledge Proofs

The zero-knowledge proof systems for the sub-graph isomorphism problem take as public input
two graphs, $G_1$ and $G_2$, and as private input a sub-graph isomorphism $\phi: G_1 \to H$,
where $H$ is a sub-graph of $G_2$. The simplest zero-knowledge proof system for sub-graph
isomorphism, ZKP1, is illustrated in Figure 3. The prover (P) randomly permutes the graph $G_2$
to obtain an isomorphic graph $\pi(G_2) = G'_2$ and then sends $G'_2$ to the verifier (V). V then
chooses a random bit $c$, which is sent to P. If $c = 0$, P sends $\pi$ to V and V checks that
$G'_2$ was formed correctly from $G_2$ and $\pi$. If $c = 1$, then P sends $\pi\phi$ to V and V
checks that $\pi(\phi(G_1))$ is a sub-graph of $G'_2$ and is isomorphic to $G_1$. Depending on
P's response to the challenge, V decides whether to accept P and continue with another iteration
of the protocol or to reject P and stop communication.
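
The exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the report's implementation: graphs are modelled as sets of vertex pairs on vertices 0..n-1, the names `permute_graph` and `zkp1_round` are hypothetical, no commitments are used (as in ZKP1), and the sub-graph check accepts any (not necessarily induced) sub-graph.

```python
import random


def permute_graph(edges, perm):
    """Relabel an undirected edge set using the vertex map perm (old -> new)."""
    return {frozenset((perm[u], perm[v])) for u, v in edges}


def zkp1_round(g1_edges, g2_edges, n2, phi, rng):
    """Run one ZKP1-style round and return True if the verifier accepts.

    g1_edges, g2_edges: public graphs G1 and G2 as lists of vertex pairs.
    n2: number of vertices of G2 (labelled 0..n2-1).
    phi: the prover's private map from vertices of G1 to vertices of G2.
    """
    g2 = {frozenset(e) for e in g2_edges}

    # Prover: choose a random permutation pi and send G2' = pi(G2).
    labels = list(range(n2))
    rng.shuffle(labels)
    pi = dict(enumerate(labels))
    g2_prime = permute_graph(g2, pi)

    # Verifier: choose a random challenge bit c.
    c = rng.randrange(2)

    if c == 0:
        # Prover reveals pi; verifier recomputes pi(G2) and compares.
        return permute_graph(g2, pi) == g2_prime

    # c == 1: prover reveals only the composed map v -> pi(phi(v)).
    composed = {v: pi[phi[v]] for v in phi}
    # Verifier: the map must be injective and carry each edge of G1 into G2'.
    if len(set(composed.values())) != len(composed):
        return False
    return all(frozenset((composed[u], composed[v])) in g2_prime
               for u, v in g1_edges)
```

With an honest prover every round is accepted; a prover whose map is not a sub-graph isomorphism is rejected whenever the challenge is c = 1.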

Both isomorphisms ($\phi$ and $\pi$) are private in the beginning, but if the verifier chooses
$c = 0$ the entire isomorphism $\pi$ is revealed, whereas if the verifier chooses $c = 1$ only
parts of the isomorphism $\pi$ are revealed. In neither case is any of the isomorphism $\phi$
revealed to the verifier. If $c = 0$, the verifier learns the isomorphism $\pi$, but learns
nothing about the sub-graph's location. If $c = 1$, the verifier learns information about the
structure of the sub-graph, but does not know the isomorphism $\pi$ and hence knows nothing about
the location of the sub-graph.

It is also possible to construct a similar zero-knowledge proof system that involves hiding
the permuted graph in a larger graph (Grigoriev and Shpilrain 2008). However, in this
modification of the protocol, either choice that the verifier can make for $c$ requires that the
prover send $G'_2$ to the verifier, which leaves the larger graph as an unnecessary addition to
the protocol.


Footnote: The Stanford GraphBase is available at http://www-cs-faculty.stanford.edu/~knuth/sgb.html

3.1.1.3 Discussion of Existing Protocols

Considering the algorithms that exist for the graph isomorphism problem (Nauty, VF2, etc.), the
protocol ZKP1 is not very secure. For example, the verifier could run an effective graph
isomorphism algorithm on $G_2$ and the graph $G'_2$ received in step 2. This enables the
verifier to uncover the isomorphism $\pi^{-1}: G'_2 \to G_2$. If the verifier then sends $c = 1$
to the prover, $\pi(\phi(G_1))$ is revealed by the prover, and so the verifier has available both
$\pi^{-1}$ and $\pi(\phi(G_1))$. This allows the verifier to discover $\phi(G_1)$, and using the
graph isomorphism algorithm again would determine $\phi$, the prover's private input.

3.1.1.4 Establishing a Better Protocol

A slight modification to ZKP1 that yields a more secure protocol is to commit to the
permuted graph that is transferred in step 2. This alteration works until the last step, in the
case that the verifier chooses $c = 1$. If the verifier chooses $c = 1$, then the prover
must reveal where the isomorphic sub-graph $\pi(\phi(G_1))$ is located in $\pi(G_2)$. In order
for the verifier to check that $\pi(\phi(G_1))$ is in fact a sub-graph that is isomorphic to
$G_1$, the verifier must be able to solve this particular instance of the graph isomorphism
problem very quickly. Thus for this change in protocol to be effective, we must be sure that the
smaller graph involved is one on which some graph isomorphism algorithm works well, and yet the
problem instance as a whole must be difficult for all sub-graph isomorphism algorithms, which is
not an easy task.

Another small adjustment to fix the faults described is illustrated in Figure 4 and is referred
to as ZKP2. In this protocol, the prover sends the permutation $\phi\pi$ (first $\phi$, then
$\pi$) as well as decommitment information to reveal the edges of the sub-graph
$\pi(\phi(G_1))$. The verifier would then be able to check that $\pi(\phi(G_1))$ is isomorphic to
$G_1$. Fortunately, even with the increased amount of information transferred, the
zero-knowledge property is still satisfied. Since the verifier does not know $\pi$ or $\phi$
individually, the verifier is unable to determine $\pi$ or $\phi$ alone from the composition
$\phi\pi$. Also, since the verifier is only shown the entries of the permuted adjacency matrix
that correspond to edges of the sub-graph, the verifier cannot uncover the initial permutation
$\phi$ unless the verifier is able to solve the sub-graph isomorphism problem.

The protocol ZKP2 is a valid zero-knowledge proof system for the sub-graph
isomorphism problem. (Note: All proofs of zero-knowledge protocols in this report are based
on the proof style of Blum (Blum 1986).)


Claim:  ZKP2  is  a  zero-knowledge  proof  system  for  the  sub-graph  isomorphism  problem. 

Proof: 

Completeness: If the prover has a yes-instance x of SGI, then the verifier will accept x
with probability 1.

Soundness: If the prover has a no-instance y of SGI, the prover will be caught only when
the verifier chooses $c = 1$. Since $c$ is chosen uniformly at random by the verifier, the
probability that the verifier will reject y is 1/2 in each round. This implies that the
probability that the verifier does not reject y after $k$ rounds is at most
$(1/2)^k = 2^{-k}$.

Zero-Knowledge Property: Suppose the verifier is attempting to extract useful
information from his conversation with the prover. Then the verifier can, in the same
manner, extract the same information even without the aid of the prover. In each round
he does the following:

Begin.

Verifier simulates the prover. The verifier flips a fair coin and, according to the
outcome of the coin, commits to either the graph $G_2$ or a copy of $G_1$ embedded
into an arbitrary n-vertex graph. $G_2$ is committed to in the same way the prover
would have done so. The sub-graph is committed to in the way the prover would
have committed to such an isomorphic sub-graph in $G_2$. Then, acting as the
prover, the verifier presents the committed information. Now he takes the other
side.


Common Input: The Paley graph ($G_2$) of order 9 and the bull graph ($G_1$).
Private Input: An isomorphism $\phi$ from $G_1$ to a sub-graph $H$ of $G_2$ (shown in red).

Prover:
1. Chooses an isomorphism $\pi: G_2 \to G'_2$.
2. Creates an adjacency matrix $A$ for $G'_2$ (commitment).
3. Sends $A$ to the verifier.
   A = [9 x 9 adjacency matrix of $G'_2$]

Verifier:
4. Chooses a random bit $c$.
5. Sends $c$ to the prover.

Case c = 0:
6. Prover: Sends $\pi$ and the decommitment information for $A$ to the verifier.
7. Verifier: Checks that $\pi(G_2) = G'_2$ and that $\pi$ is a valid isomorphism.

Case c = 1:
6. Prover: Sends the decommitment information for the entries of $A$ that correspond to
   edges of $\pi(\phi(G_1))$ to the verifier.
7. Prover: Sends the permutation $\phi\pi$ to the verifier.
8. Verifier: Checks that $\pi(\phi(G_1))$ is isomorphic to $G_1$, and that $\pi(\phi(G_1))$
   is what was revealed.

Figure 4: ZKP2 for the sub-graph isomorphism problem example

Verifier simulates the verifier. The verifier guesses uniformly at random
whether to request the graph or an isomorphic sub-graph. Because the verifier has
no way to tell with any advantage whether the committed matrix contains the
graph or an isomorphic sub-graph (because the choice is random), there is a 50%
chance that he requests an option (graph or sub-graph) that the verifier, in the
guise of prover, can supply. If not, the verifier backs up the simulation to the
state it was in at the start of this round and restarts the entire round (verifier
simulating the prover).

End. 

In  an  expected  2  passes  through  each  round,  the  verifier  will  obtain  the  information 
without  the  help  of  the  prover.  Thus  the  interaction  does  not  help  the  verifier  do 
something  with  the  prover  in  expected  polynomial  time  that  he  could  not  as  well  have 
done  without  the  prover  in  expected  polynomial  time.  ■ 
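
The figure of two expected passes follows from a geometric distribution. As a sketch, assume that each simulated attempt matches the verifier's request independently with probability 1/2 (the 50% chance stated in the simulation); the symbol $N$ for the number of passes needed in one round is introduced here only for illustration:

```latex
\mathbb{E}[N] = \sum_{i=1}^{\infty} i \left(\tfrac{1}{2}\right)^{i-1} \cdot \tfrac{1}{2}
             = \sum_{i=1}^{\infty} \frac{i}{2^{i}} = 2
```

Over $k$ rounds the expected total is therefore $2k$ passes, which is polynomial in $k$.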

Consider the zero-knowledge proof system ZKP2 for the sub-graph isomorphism
problem. This protocol shows the basic structure of all of the protocols in this section. Figure 5
illustrates the protocol in the case that the prover is attempting to cheat. The prover does not
have a valid isomorphism from $G_1$ to a sub-graph of $G_2$, and the verifier must catch this.

In order for these zero-knowledge proof systems to be of use, we must determine the total
number of bits to be transferred. In ZKP2, the graphs that we are considering are simple,
undirected graphs. This implies that the adjacency matrices will be symmetric with zeros along
the diagonal and with all entries equal to either 0 or 1. Thus in a graph with $n$ vertices, the
prover only needs to transmit $\binom{n}{2}$ entries of $A$ to the verifier, where
$n = |V(G_2)|$. Hence step 3 requires the transmission of $\binom{n}{2}$ entries, each of which
is one bit. In step 5, the verifier sends one bit. If the verifier chooses $c = 0$, the prover
must send the isomorphism $\pi$. We can send this in list form, and so we need $n \log_2 n$
bits. If the verifier chooses $c = 1$, the prover must identify the sub-graph $\pi(\phi(G_1))$
and must send the permutation $\phi\pi$ to the verifier. This requires sending decommitment
information for the edges corresponding to $\pi(\phi(G_1))$ and also sending a permutation in
list form with $n$ entries.


When considering the maximum number of bits that will be necessary in ZKP2, not
including what is needed for commitment methods, the number transferred will be:

\[ \binom{n}{2} + 1 + n \log_2 n \tag{6} \]


Approved  for  Public  Release;  Distribution  Unlimited. 


Common Input: The Paley graph ($G_2$) of order 9 and the bull graph ($G_1$).
Private Input: An invalid isomorphism $\phi$ from $G_1$ to a sub-graph $H$ of $G_2$
(shown in red).

Prover:
1. Chooses an isomorphism $\pi: G_2 \to G'_2$.
2. Creates an adjacency matrix $A$ for $G'_2$ (commitment).
3. Sends $A$ to the verifier.
   A = [9 x 9 adjacency matrix of $G'_2$]

Verifier:
4. Chooses a random bit $c$.
5. Sends $c$ to the prover.

Case c = 0:
6. Prover: Sends $\pi$ and the decommitment information for $A$ to the verifier.
7. Verifier: Checks that $\pi(G_2) = G'_2$ and that $\pi$ is a valid isomorphism.

Case c = 1:
6. Prover: Sends the decommitment information for the entries of $A$ that correspond to
   edges of $\pi(\phi(G_1))$ to the verifier.
7. Prover: Sends the permutation $\phi\pi$ to the verifier.
8. Verifier: Sees that $\pi(\phi(G_1))$ is not isomorphic to $G_1$ and so rejects the prover.

Figure 5: ZKP2 for sub-graph isomorphism with cheating prover example


If the maximum amount of information to be transmitted is 10 kilobits, then we must have:

\[ \binom{n}{2} + 1 + n \log_2 n \le 10000 \tag{7} \]

\[ n \le 134 \tag{8} \]
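
The bound on n can be checked numerically. The short sketch below evaluates the bit count C(n,2) + 1 + n log2 n for increasing n and reports the largest n that fits a 10,000-bit budget; the function names are illustrative and commitment overhead is ignored, as in the text.

```python
from math import comb, log2


def zkp2_bits(n):
    """Bits per round without commitments: C(n,2) + 1 + n*log2(n)."""
    return comb(n, 2) + 1 + n * log2(n)


def max_vertices(budget_bits):
    """Largest n whose per-round bit count stays within the budget."""
    n = 1
    while zkp2_bits(n + 1) <= budget_bits:
        n += 1
    return n


print(max_vertices(10_000))  # prints 134
```

Because the C(n,2) term dominates, doubling the budget raises the admissible n only by a factor of about the square root of two.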

Thus the largest instance that could be considered would have at most 134 vertices in the larger
of the two graphs. Note that this bound holds for either choice that the verifier makes for $c$.

3.1.2 Graph Isomorphism Problem


The graph isomorphism problem (GIP) is stated as follows: Given two graphs $G_1$ and $G_2$, is
there an isomorphism $\phi: G_1 \to G_2$? The GIP is known to belong to the class NP, but it has
not been determined to be NP-complete. It is conjectured that the GIP lies in NP but outside of
both P and the class of NP-complete problems (Conte, et al. 2004).

3.1.2.1  Algorithms 

The three main algorithms for solving the graph isomorphism problem are Nauty (1981),
Ullman's algorithm (1976), and VF2 (2004). All three algorithms are able to solve instances of
the problem at remarkable speeds. However, VF2 seems to consistently outperform Ullman's
algorithm (Cordella, et al. 2004), so the focus of this section will be on Nauty and VF2.

The Nauty algorithm, created by Brendan McKay (McKay 1981), uses a large amount of
group theory to determine a canonical labeling of the graphs (Fortin 1996). The main idea of the
algorithm is that if the canonical labelings of the two graphs are the same, then the
graphs must be isomorphic. VF2, on the other hand, relies upon backtracking and pruning the
search space according to some specified feasibility rules (Cordella, et al. 2004).

In comparing VF2 and Nauty, neither algorithm clearly outperforms the other. In the
results of a set of tests comparing the two algorithms on three different classes of graphs ranging
from 20 to 1,000 vertices, Nauty appears to be more effective on random graphs, while VF2 is
more effective on 2D-mesh graphs and bounded valence graphs (Cordella, et al. 2004). In
further testing, it is shown that on all benchmark classes of graphs that were selected with a
maximum of 1,100 vertices, at least one of VF2 and Nauty can solve the problem instance in less
than one second (Foggia, Sansone and Vento 2001).

While  it  has  not  yet  been  determined  which  classes  of  graphs  the  algorithms  Nauty  and 
VF2  struggle  with,  one  idea  has  appeared  in  the  literature  (Fortin  1996),  (Hernandez-Goya  and 
Caballero-Gil  2004).  It  is  possible  to  create  hard  instances  of  the  GIP  by  swapping  the  endpoints 
of  two  different  edges  in  a  highly  symmetric  regular  graph.  It  is  reported  that  the  resulting  graph 
will  be  several  hundred  times  harder  for  Nauty  (Fortin  1996). 
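
One way to read the endpoint-swapping idea is as a degree-preserving exchange: two vertex-disjoint edges (a, b) and (c, d) are replaced by (a, d) and (c, b). The sketch below implements that reading. The cited papers' exact construction is not reproduced here, so this interpretation and the function name `swap_endpoints` are assumptions made for illustration.

```python
import random


def swap_endpoints(edges, rng, attempts=1000):
    """Return a copy of an undirected edge set with one endpoint swap applied.

    Two vertex-disjoint edges (a, b) and (c, d) are replaced by (a, d) and
    (c, b), so every vertex keeps its degree and the graph stays regular.
    """
    edge_set = {frozenset(e) for e in edges}
    edge_list = sorted(tuple(sorted(e)) for e in edge_set)
    for _ in range(attempts):
        (a, b), (c, d) = rng.sample(edge_list, 2)
        if len({a, b, c, d}) < 4:
            continue  # the two edges share a vertex
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue  # the swap would create a duplicate edge
        removed = {frozenset((a, b)), frozenset((c, d))}
        return (edge_set - removed) | {new1, new2}
    raise ValueError("no valid swap found")
```

Applied to a highly symmetric regular graph, a single swap keeps the degree sequence unchanged while breaking part of the symmetry.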


3.1.2.2  Discussion  of  Existing  Protocols 

Given the efficiency of the existing algorithms, the graph isomorphism problem will be
difficult to use as a base problem for a secure protocol. However, we will discuss two types of
zero-knowledge proof systems for the graph isomorphism problem. The first type of
zero-knowledge protocol that exists for the graph isomorphism problem is identical to that of the
sub-graph isomorphism problem. The only difference between protocols for the two problems is
that the sub-graph is no longer a proper sub-graph, but the entire graph, i.e. $H = G_2$.

The second type of zero-knowledge proof system works only for the graph isomorphism
problem. The protocol takes as public input two graphs $G_1$ and $G_2$ and as private input an
isomorphism $\phi: G_1 \to G_2$. First, the prover creates an isomorphic copy $G'_2$ of $G_2$
(say $G'_2 = \psi(G_2)$) and sends the copy to the verifier. The verifier chooses a challenge bit
and sends that choice to the prover. If the verifier sent a challenge bit equal to zero, then the
prover sends $\psi$ to the verifier and the verifier checks that $G'_2 = \psi(G_2)$. If the
verifier sent a challenge bit equal to one, then the prover sends $\psi\phi$ to the verifier, who
checks that $\psi(\phi(G_1)) = G'_2$ (Goldreich, Micali and Wigderson 1991), (Hernandez-Goya and
Caballero-Gil 2004), (Simari 2002), (Grigoriev and Shpilrain 2008).

As  mentioned,  the  ease  with  which  the  current  algorithms  are  able  to  solve  exactly  the 
graph  isomorphism  problem  makes  these  protocols  mostly  useless.  Unless  a  class  of  difficult 
instances  is  determined,  the  protocols  are  not  secure,  even  though  they  satisfy  the  necessary 
properties  for  a  zero-knowledge  proof  system. 

3.1.3  Graph  Clustering  Problem 

The graph clustering problem (GCP) is a more general case of both the graph isomorphism and
the graph non-isomorphism problem (the complement of the graph isomorphism problem). The
GCP is defined as follows (Goldreich 1996): Given a sequence of graphs $(G_1, \ldots, G_n)$ and a
sequence of positive integers $(m_1, \ldots, m_k)$, does there exist a partition
$C_1, \ldots, C_k$ of $[n]$ such that:

1. $|C_i| = m_i$ for $i = 1, \ldots, k$.

2. For all $i \in [k]$ and every $x, y \in C_i$, the graphs $G_x$ and $G_y$ are isomorphic.

3. For all $x \ne y \in [k]$ and all $a \in C_x$ and all $b \in C_y$, the graphs $G_a$ and $G_b$
are not isomorphic.

In  other  words,  we  are  looking  to  determine  if  under  the  equivalence  relation  of  graph 
isomorphisms,  the  sizes  of  the  equivalence  classes  are  represented  by  the  given  sequence  of 
positive  integers. 

3.1.3.1  Existing  Zero-Knowledge  Proofs 

The following noninteractive zero-knowledge protocol for GCP relies upon several foundational
theorems. The first theorem states that we can construct a monotone formula that determines the
value of $\mathrm{THRESH}_{t,n}(x_1, \ldots, x_n)$ in polynomial time, where
$\mathrm{THRESH}_{t,n}$ is a Boolean function of the Boolean variables $x_1, \ldots, x_n$ that
returns true if and only if at least $t$ of the $n$ variables $x_i$ are true. The second and
third theorems state that there exists a perfect zero-knowledge proof system for all instances of
true monotone formulae over statements related to graph (non-)isomorphism (DeSantis, Di
Crescenzo and Persiano, et al. 1994).

The zero-knowledge proof system discussed below, published by (DeSantis, Di Crescenzo and
Goldreich, et al. 1999), takes as public input a sequence of n graphs and a sequence of k positive
integers.


1. The prover P proves that the equivalence relation has at least k equivalence classes.

Determining that at least $k - 1$ graphs are non-isomorphic to all earlier graphs in the
initial sequence proves this statement. To accomplish this, we use the first and third
theorems to prove in zero-knowledge the statement
$T_1 = \mathrm{THRESH}_{k-1,n-1}(f_2, \ldots, f_n)$, where
$f_i = \bigwedge_{j<i} (G_i \text{ is not isomorphic to } G_j)$.

2. P proves that the equivalence relation has at most k equivalence classes.

Determining that at least $(n - 1) - (k - 1)$ graphs are each isomorphic to some earlier graph
in the initial sequence proves this statement. To accomplish this, we use the first and
second theorems to prove in zero-knowledge the statement
$T_2 = \mathrm{THRESH}_{n-k,n-1}(f'_2, \ldots, f'_n)$, where
$f'_i = \bigvee_{j<i} (G_i \text{ is isomorphic to } G_j)$.

3. P proves that at least a certain number of equivalence classes have a given minimum
size.

We first define $0 = s_0 < s_1 < \cdots < s_d$ such that
$\{s_1, \ldots, s_d\} = \{m_1, \ldots, m_k\}$ and define $n_j = |\{i : m_i = s_j\}|$ for each
$j = 1, \ldots, d$. P proves the statements:

$T_{3,j}$ = 'at least $x = n_d + \cdots + n_{d-j+1}$ classes have size at least
$y = s_{d-j+1}$' for $j = 1, \ldots, d$.

This is done by proving in zero-knowledge
$T_{3,j} = \mathrm{THRESH}_{n-k+x,n}(g_1, \ldots, g_n)$, where:
$g_i = \left(\bigvee_{l<i} (G_i \cong G_l)\right) \vee
\mathrm{THRESH}_{y,n}\big((G_i \cong G_1), \ldots, (G_i \cong G_n)\big)$.

4. P proves that at least a certain number of equivalence classes have a given maximum
size.

Using the same definitions as in step 3, P proves the statements:

$T_{4,j}$ = 'at least $x = n_1 + \cdots + n_j$ classes have size at most $y = s_j$' for
$j = 1, \ldots, d$.

This is accomplished by proving in zero-knowledge
$T_{4,j} = \mathrm{THRESH}_{n-k+x,n}(g'_1, \ldots, g'_n)$, where:
$g'_i = \left(\bigvee_{l<i} (G_i \cong G_l)\right) \vee
\mathrm{THRESH}_{n-y,n}\big((G_i \not\cong G_1), \ldots, (G_i \not\cong G_n)\big)$.

To prove that this is in fact a noninteractive zero-knowledge protocol, we note that $T_1$ and
$T_2$ hold if and only if there are exactly k equivalence classes in the sequence of graphs. Also,
through basic algebraic manipulation and induction, it is easily proven that statements
$T_{3,j}$ and $T_{4,j}$ hold for every $j$ if and only if the k equivalence classes have the
correct sizes as specified by the sequence of positive integers (DeSantis, Di Crescenzo and
Goldreich, et al. 1999). Since the protocol is a composition of zero-knowledge protocols based on
the THRESH function, the protocol is also zero-knowledge.
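
The counting argument behind steps 1 and 2 can be checked on small inputs. The sketch below evaluates the two threshold statements in the clear (it is not a zero-knowledge proof). It assumes an isomorphism oracle `iso(i, j)` supplied by the caller, uses 0-based graph indices, and reads step 2 as requiring at least n - k graphs to be isomorphic to some earlier graph; the function names are hypothetical.

```python
def thresh(t, values):
    """THRESH_{t,n}: true iff at least t of the n Boolean inputs are true."""
    return sum(bool(v) for v in values) >= t


def has_exactly_k_classes(n, k, iso):
    """Evaluate the step 1 and step 2 statements for graphs indexed 0..n-1.

    iso(i, j) is an assumed oracle returning True iff graphs i and j are
    isomorphic.
    """
    # f_i: graph i is not isomorphic to any earlier graph.
    f = [all(not iso(i, j) for j in range(i)) for i in range(1, n)]
    # f'_i: graph i is isomorphic to some earlier graph.
    f_prime = [any(iso(i, j) for j in range(i)) for i in range(1, n)]
    at_least_k = thresh(k - 1, f)       # step 1
    at_most_k = thresh(n - k, f_prime)  # step 2
    return at_least_k and at_most_k
```

For example, with class labels a, b, a, c, b standing in for five graphs, the two statements hold together only for k = 3.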

3.1.3.2  Discussion  of  Problem 

The literature does not appear to discuss whether GCP is an NP-complete
problem or not. It is clear that it lies in the class NP, as given a true instance of the GCP, a
witness for the problem is a set defining the partitions together with a set of isomorphisms from
each graph to another in the same partition class, and this witness can be easily verified. We do
note, however, that when our sequence of positive integers is {1, 1} then the problem is an
instance of the graph non-isomorphism problem, and when our sequence of positive integers is
{2} then the problem becomes an instance of the graph isomorphism problem. Thus we can see
that GCP is at least as hard as the graph isomorphism and graph non-isomorphism problems, and
that determining the complexity of the graph (non-)isomorphism problem will determine the
complexity of the graph clustering problem.

While no algorithms designed specifically for the graph clustering problem have been found
yet, it should be noted that the problem can be solved by repeatedly applying any
algorithm for solving the graph isomorphism problem. In the worst case, each graph would be in
a separate equivalence class. This would then imply that any graph isomorphism algorithm
would need to be applied to the instance at most $n(n - 1)/2$ times in order to determine the
equivalence classes (each graph needs only to be compared to one graph in each equivalence
class determined before). Thus in instances where the decision version of the graph
isomorphism problems involved can be determined in under $t$ seconds, the entire graph
clustering problem instance could be solved in under $n^2 t$ seconds. Due to the significantly
increased amount of information needed for the problem, namely the sequence of graphs, the
graph clustering problem is most likely not a good candidate for a security protocol.
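
The repeated-comparison strategy can be sketched as follows. The brute-force `isomorphic` test stands in for a real solver such as Nauty or VF2 and is only practical for very small graphs; all function names are illustrative, and every graph is assumed to use the vertex set 0..n-1.

```python
from itertools import permutations


def isomorphic(n, edges_a, edges_b):
    """Brute-force isomorphism test on vertices 0..n-1 (tiny graphs only)."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    if len(a) != len(b):
        return False
    for perm in permutations(range(n)):
        if {frozenset((perm[u], perm[v])) for u, v in a} == b:
            return True
    return False


def cluster(n, graphs):
    """Group graph indices into isomorphism classes and count solver calls."""
    classes, calls = [], 0
    for idx, graph in enumerate(graphs):
        for cls in classes:
            calls += 1
            # Compare against one representative of each known class.
            if isomorphic(n, graphs[cls[0]], graph):
                cls.append(idx)
                break
        else:
            classes.append([idx])
    return classes, calls


def solves_gcp(n, graphs, sizes):
    """True iff the class sizes match the given sequence of integers."""
    classes, _ = cluster(n, graphs)
    return sorted(len(c) for c in classes) == sorted(sizes)
```

The call counter makes the bound visible: for m graphs the solver is never invoked more than m(m - 1)/2 times.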


3.1.4  Independent  Set  Problem 

The independent set problem (ISP) is stated as follows: Given a graph G and an integer k, does
G contain an independent set of size k? This question is the decision version of an optimization
problem known as the maximum independent set problem. The optimization (maximum)
independent set problem is as follows: Given a graph G, what is the size of a maximum
independent set in G? The optimization problem is an NP-hard problem and the decision version
is a well-known NP-complete problem (Garey and Johnson 1979).

Another NP-complete problem that is equivalent to the ISP is the maximum clique
problem (MCP). The MCP is stated as follows: Given a graph G and an integer k, does G
contain a clique of size k? The equivalence of the two problems can be seen clearly when we
observe that an independent set in G is a clique in the complement of G and vice versa. Thus
given any instance of the ISP, we can easily convert it to an instance of the MCP merely by
considering the complement of the graph in question.
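
The conversion between the two problems is short enough to state directly. The following sketch builds the complement of a graph on vertices 0..n-1 and checks a candidate set both ways; the function names are illustrative.

```python
from itertools import combinations


def complement(n, edges):
    """Edge set of the complement of a simple graph on vertices 0..n-1."""
    present = {frozenset(e) for e in edges}
    return {frozenset(p) for p in combinations(range(n), 2)} - present


def is_independent_set(edges, s):
    """True iff no two vertices of s are joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) not in present for p in combinations(s, 2))


def is_clique(edges, s):
    """True iff every two distinct vertices of s are joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(s, 2))
```

A set S passes `is_independent_set` on G exactly when it passes `is_clique` on `complement(n, G)`, so a solver for either problem can be reused for the other.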

A problem closely related to the ISP is the k-independent set problem (KIS). The
problem is stated as follows: Given a graph G and positive integers j and k, does there exist a
k-independent set (a set of vertices such that the length of the shortest path between any two
distinct vertices in the set is at least k) of size j? The KIS is an NP-complete problem, a fact
that is clearly seen when we observe that the ISP is a subproblem of the KIS, namely the case
k = 2 (Desmedt and Wang 2003).


3.1.4.1  Algorithms 

Several  near-optimal  algorithms  have  been  proposed  to  deal  with  the  ISP  and  the  MCP.  In  a 
relatively  recent  publication,  it  was  reported  that  the  most  competitive  algorithms  are  DLS 
(Dynamic  Local  Search),  RLS  (Reactive  Local  Search),  and  VNS  (Variable  Neighborhood 
Search)  (Grosso,  Locatelli  and  Pullan  2008).  While  these  algorithms  are  geared  towards  the 
MCP,  the  equivalence  of  the  MCP  and  the  ISP  allows  the  algorithms  to  be  used  easily  on  either 
problem. 

DLS-MC,  a  DLS  variant,  was  introduced  in  2006  (Pullan  and  Hoos  2006)  and  is  based  on 
stochastic  local  search.  It  assigns  penalty  values  to  the  vertices  in  order  to  help  the  algorithm 
avoid  cycling  around  local  optima.  The  creators  conclude  from  their  testing  that  the  DLS-MC 
outperforms  several  older  algorithms  and  improves  upon  the  previously  existing  DLS  algorithms. 
The RLS algorithm was improved upon in 2007, and so has been replaced by R-Evo and
RLS-Evo. These modified RLS algorithms both begin by obtaining an initial estimate, after which a
better  solution  is  searched  for.  They  employ  a  model-based  approach  in  which  the  current 
solution  is  used  to  provide  information  about  possible  locations  of  a  better  solution  (Battiti  and 
Brunato  2007).  An  efficient  algorithm  that  often  outperforms  RLS  is  KLS.  KLS  is  based  on  the 
technique  of  variable  depth  search,  a  variation  of  local  search,  and  proceeds  by  adding  and 
removing  vertices  from  the  current  clique  in  order  to  find  a  larger  one  (Katayama,  Hamamoto 
and  Narihisa  2005). 

The general result of the published material on ISP or MCP algorithms is that there is no
"best" algorithm for every instance of the problem. Fortunately, there is a standard set of
benchmark graphs on which most algorithms are tested and compared. These benchmarks, known
as the DIMACS benchmark instances for the maximum clique problem, originated from The
Second DIMACS Implementation Challenge: 1992-1993 (Johnson and Trick 1996). The
DIMACS graphs range in size from under 100 vertices to 4,000 vertices; however, it is not clear
what determines the difficulty level of the graphs. Reviewing the published results, it appears
that almost every DIMACS benchmark graph can be solved for the best known solution in less
than 200 seconds.


3.1.4.2  Existing  Zero-Knowledge  Proofs 

The zero-knowledge proof systems for the independent set problem take as public input a graph,
$G$, and a positive integer $k$, and as private input a set $S \subseteq V(G)$, where $S$ is an
independent set of size $k$. A zero-knowledge proof system (ZKP3) for the ISP is illustrated in
Figure 6 (Desmedt and Wang 2003). The prover (P) randomly chooses an isomorphism $\pi$ to
permute the graph $G$ and then sends a commitment to this new graph, $\pi(G)$, to the verifier
(V). V then chooses a random bit $c$, which is sent to P. If $c = 0$, P sends $\pi$ to V along
with the decommitment information for $\pi(G)$, and V checks that $\pi(G)$ was formed correctly
from $G$ and $\pi$. If $c = 1$, then P sends the decommitment information for $\pi(G[S])$ to V,
who checks that $\pi(G[S])$ has all entries equal to zero. This then implies that $S$ is an
independent set.
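
One round of this exchange, including a commitment step, might look as follows. The report does not specify a commitment scheme, so the salted SHA-256 commitment used here is an assumption chosen for illustration, as are the function names and the choice to commit to each upper-triangle adjacency entry separately.

```python
import hashlib
import random
import secrets
from itertools import combinations


def commit(bit):
    """Salted-hash commitment to one bit; returns (digest, opening)."""
    nonce = secrets.token_bytes(16)
    return hashlib.sha256(nonce + bytes([bit])).digest(), (nonce, bit)


def opens_to(digest, opening):
    """Check that an opening (nonce, bit) matches a commitment digest."""
    nonce, bit = opening
    return hashlib.sha256(nonce + bytes([bit])).digest() == digest


def zkp3_round(n, edges, k, s, rng):
    """One ZKP3-style round on vertices 0..n-1; True if the verifier accepts."""
    g = {frozenset(e) for e in edges}
    # Prover: random relabelling pi, then commit to each entry of pi(G).
    labels = list(range(n))
    rng.shuffle(labels)
    pi = dict(enumerate(labels))
    permuted = {frozenset((pi[u], pi[v])) for u, v in g}
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    sealed = {p: commit(int(p in permuted)) for p in pairs}

    c = rng.randrange(2)  # Verifier: random challenge bit.
    if c == 0:
        # Prover opens everything and reveals pi; verifier rebuilds pi(G).
        expected = {frozenset((pi[u], pi[v])) for u, v in g}
        return all(opens_to(d, o) and o[1] == int(p in expected)
                   for p, (d, o) in sealed.items())
    # c == 1: prover opens only the entries among the k vertices of pi(S).
    image = {pi[v] for v in s}
    if len(image) != k:
        return False
    for p in combinations(sorted(image), 2):
        digest, opening = sealed[frozenset(p)]
        if not opens_to(digest, opening) or opening[1] != 0:
            return False
    return True
```

An honest prover is accepted in every round, while a prover whose set contains an edge is rejected whenever the challenge is c = 1.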

3.1.4.3  Discussion  of  Existing  Protocols 

When we consider the soundness property of ZKP3, a cheating prover with a no-instance of the
problem will only be caught when the verifier chooses $c = 1$. As the verifier chooses $c$
randomly and uniformly from the set {0,1}, the probability that a cheating prover will be caught
in each round is 1/2. Thus the probability that a verifier will not reject a cheating prover after
k rounds is $(1/2)^k$, and so the soundness property is satisfied. The protocol ZKP3 also
satisfies the completeness and zero-knowledge properties and therefore is a zero-knowledge proof
system (Desmedt and Wang 2003).
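
The soundness bound translates directly into a round count. As a small worked sketch (function names are illustrative), the number of rounds needed to push a cheating prover's success probability below a chosen target is the smallest number of rounds r with (1/2)^r at most that target:

```python
from math import ceil, log2


def cheating_success_probability(rounds):
    """Probability that a cheating prover survives the given number of rounds."""
    return 0.5 ** rounds


def rounds_needed(max_cheat_probability):
    """Smallest number of rounds r with (1/2)**r <= max_cheat_probability."""
    return max(0, ceil(-log2(max_cheat_probability)))


print(rounds_needed(1e-6))  # prints 20
```

Each additional round halves the cheating probability, so the cost in rounds grows only logarithmically in the desired confidence.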

Because  of  the  equivalence  between  the  independent  set  problem  and  the  maximum 
clique  problem,  ZKP3  can  be  slightly  modified  to  give  a  valid  zero-knowledge  proof  system  for 
the  MCP.  The  only  change  that  needs  to  be  made  is  in  the  last  step  that  occurs  after  a  verifier 
chooses  c  —  1.  Instead  of  checking  that  every  transferred  matrix  entry  is  zero,  the  verifier  must 
check  that  every  entry  that  is  not  along  the  diagonal  is  equal  to  one.  This  then  demonstrates  that 
the  sub-graph  revealed  is  in  fact  a  complete  graph,  as  every  pair  of  distinct  vertices  has  an  edge 
between them.

The protocol ZKP3 can also be altered to handle the KIS. First, define the set E_k to be
the set of all pairs of vertices (u, v) such that the length of the shortest path between u and v in
G is at most k − 1, and let V_k = V(G). Then we define G_k = (V_k, E_k). A set V' ⊆ V(G) is a
k-independent set in G if and only if V' is an independent set in G_k. By using G_k as the common
input to ZKP3, we have a zero-knowledge proof system for the KIS (Desmedt and Wang 2003).
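
The construction of the auxiliary graph can be sketched as follows. This is a minimal illustration, assuming the graph is given as a dictionary of adjacency sets; the helper names `build_gk` and `is_independent` are hypothetical.

```python
from collections import deque
from itertools import combinations


def distances_from(adj, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_gk(adj, k):
    """Join u and v whenever their distance in G is between 1 and k - 1."""
    gk = {u: set() for u in adj}
    for u in adj:
        for v, d in distances_from(adj, u).items():
            if v != u and d <= k - 1:
                gk[u].add(v)
    return gk


def is_independent(adj, vertices):
    """True if no two of the given vertices are adjacent."""
    return all(v not in adj[u] for u, v in combinations(vertices, 2))


if __name__ == "__main__":
    # A path a-b-c-d-e, used only as a small example.
    path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
            "d": {"c", "e"}, "e": {"d"}}
    print(is_independent(build_gk(path, 3), ["a", "d"]))
```

A set is then k-independent in the original graph exactly when `is_independent` accepts it on the graph returned by `build_gk`.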




3.1.5  Longest  Path  Problem 


The longest path problem (LPP) is stated as follows: Given a graph G and a positive integer k,
does G contain a path of length k? (We are using the assumption that the length of a path is the
number of edges of the path.) Figure 7 illustrates an example of the LPP. The more commonly
known version of the longest path problem is an optimization problem that asks for a witness for
the value max{|P| : P a path, P ⊆ G}. Our phrasing of the LPP is merely the decision version that
corresponds to the optimization problem. The LPP is an NP-complete problem, and contains the
Hamiltonian path problem as a subproblem. However, the LPP is a more difficult problem than
the Hamiltonian path problem as the longest path in the graph does not necessarily travel through
every vertex. It is an easy reduction from the Hamiltonian path problem to the LPP, and hence
the NP-completeness is clear. The optimization version of the problem is NP-hard. There are
few graph classes that are known to be easily solvable (in polynomial time). One class of graphs
that can be solved quickly is the class of directed acyclic graphs (Garey and Johnson 1979).
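
To illustrate why directed acyclic graphs are an easy case, the sketch below computes the longest directed path by dynamic programming over a topological order. It assumes the graph is given as a dictionary mapping each vertex to its successors and relies on the standard-library `graphlib` module (Python 3.9 or later); the function names are illustrative.

```python
from graphlib import TopologicalSorter


def longest_path_length_dag(successors):
    """Number of edges on a longest directed path in a DAG."""
    # Passing the successor lists as "dependencies" makes static_order()
    # emit every vertex after all of its successors.
    order = TopologicalSorter(successors).static_order()
    best = {}
    for u in order:
        best[u] = max((1 + best[v] for v in successors.get(u, ())), default=0)
    return max(best.values(), default=0)


def has_path_of_length(successors, k):
    """Decision version: a path with k edges exists iff the longest has >= k."""
    return longest_path_length_dag(successors) >= k


if __name__ == "__main__":
    dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
    print(longest_path_length_dag(dag))
```

Each vertex and edge is handled once, so the running time is linear in the size of the graph, in contrast to the general case.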

3.1.5.1 Algorithms


There are few algorithms that are capable of coping with the LPP. Even approximation
algorithms are difficult to come by, as the optimization problem associated with the LPP is
thought to lie outside of the class of problems APX, the class of optimization problems for
which polynomial-time approximation algorithms with approximation ratios bounded by
constants exist (Bjorklund and Husfeldt 2003). In fact, it has been proven that the longest path
problem must lie outside of APX unless P = NP (Karger, Motwani and Ramkumar 1997).

It seems that one of the best performing algorithms currently is a hybrid depth-first-search
algorithm that produces either an exact solution to the problem instance in Õ(2^{n/α(n)})
time, where Õ(g(n)) = O(g(n) log^k(n)) for some k, or an O(α(n) log n)-approximation, for
any α that is an unbounded function (Vassilevska, Williams and Woo 2006). Another possible
option is applying a sub-graph isomorphism algorithm to the problem, since the length of the
longest path will be available as common knowledge in the zero-knowledge proof system
considered for this problem.


3.1.5.2  Establishing  a  Protocol 

Because the LPP is a subproblem of the SGI, we can modify ZKP2 slightly to obtain ZKP4, a
zero-knowledge proof system for the longest path problem. The common inputs to the protocol
are a graph, G, and a positive integer, k, which represents the length of a longest path in G. The
private input is the longest path itself.

Figure 8 illustrates a zero-knowledge proof system, ZKP4, for an instance of the LPP.
The prover (P) chooses randomly an isomorphism π to permute the graph G and then sends a
commitment to this new graph, π(G), to the verifier (V). V then chooses a random bit c, which
is sent to P. If c = 0, P sends π to V along with the decommitment information for π(G), and V
checks that π(G) was formed correctly from G and π. If c = 1, then P sends the decommitment
information for π(P) to V (where π(P) represents the entries corresponding to the edges of the
path that is the private input) and V checks that π(P) forms a path of the specified length.

Note that in ZKP4 the prover does not need to send any information in addition to the
edges of the path to the verifier. The permutations used by the prover are unnecessary
information for the verifier, as checking that the edges revealed form a path is a simple task
without the knowledge of the permutations. The prover also does not need to identify the
vertices that are endpoints on the path, as the verifier can determine these from the revealed
entries by examining which rows and columns have one and only one entry equal to 1.
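
The verifier's check for c = 1 can be sketched as follows. This is a minimal illustration, assuming the revealed entries are passed as (row, column) positions that all opened to 1; the function name `forms_path_of_length` is hypothetical.

```python
from collections import defaultdict


def forms_path_of_length(revealed_edges, k):
    """True if the revealed matrix positions form one simple path with k edges."""
    edges = {frozenset(e) for e in revealed_edges}
    if len(edges) != k or any(len(e) != 2 for e in edges):
        return False
    neighbours = defaultdict(set)
    for e in edges:
        u, v = tuple(e)
        neighbours[u].add(v)
        neighbours[v].add(u)
    # Endpoints are the rows/columns with exactly one revealed entry.
    endpoints = [v for v in neighbours if len(neighbours[v]) == 1]
    if len(endpoints) != 2 or any(len(n) > 2 for n in neighbours.values()):
        return False
    # Walk from one endpoint; a single path uses all k edges on the way.
    previous, current, steps = None, endpoints[0], 0
    while True:
        onward = [w for w in neighbours[current] if w != previous]
        if not onward:
            break
        previous, current = current, onward[0]
        steps += 1
    return steps == k and current == endpoints[1]


if __name__ == "__main__":
    # A leaf-to-leaf path through the root of a binary tree labelled 1..7.
    print(forms_path_of_length([(4, 2), (2, 1), (1, 3), (3, 6)], 4))
```

The walk is needed because counting degrees alone would also accept a shorter path together with a disjoint cycle.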

Common Input:  The complete binary tree (G) of order 7 (shown) and a positive integer k = 4.
Private Input: A path P of length k (shown in red).

Prover:   1. Chooses an isomorphism π: G → G'.
Prover:   2. Creates an adjacency matrix A for G'.
Prover:   3. Sends a commitment to A to the verifier.
Verifier: 4. Chooses a random bit c.
Verifier: 5. Sends c to the prover.

If c = 0:
Prover:   6. Sends π and the decommitment information for A to the verifier.
Verifier: 7. Checks that π(G) = G'.

If c = 1:
Prover:   6. Sends the decommitment information for the entries of A that correspond to
             the edges of π(P) to the verifier.
Verifier: 7. Checks that all revealed entries of π(P) are equal to 1 and form a path of length k.

Figure 8: ZKP4 for the longest path problem example


3.1.6  Hamiltonian  Cycle  Problem 

The  Hamiltonian  cycle  problem  (HCP)  is  stated  as  follows:  Given  a  graph  G,  does  G  contain  a 
Hamiltonian  cycle  (a  cycle  that  passes  through  every  vertex  of  the  graph  once  and  only  once)? 
The  HCP  is  one  of  the  best  known  NP-complete  problems,  and  is  used  often  in  proving  other 
problems  to  be  NP-complete  (Garey  and  Johnson  1979).  There  are  several  closely  related  NP- 
complete  problems,  such  as  the  Hamiltonian  path  problem,  the  directed  Hamiltonian  cycle 
problem, and the Hamiltonian path problem between two points. Some cases of the HCP are known to be
easy (solvable in polynomial time), such as if G has no vertex with degree greater than two or if
G is a line graph.


3.1.6.1  Traveling  Salesman  Problem 


The HCP is closely related to a very well-known problem: the Traveling Salesman Problem (TSP). The
TSP is stated as follows: Given a graph G' with weighted edges, find a Hamiltonian cycle with
the minimum total weight possible. Given an instance of the HCP, it is easy to create an instance
of the TSP. Let G be an instance of HCP. Construct G' from G as follows: Let V(G') = V(G)
and define the edge set as E(G') = {xy : x, y ∈ V(G')}. Assign edge weights as follows: For
e ∈ E(G'),

$$w(e) = \begin{cases} 2, & e \in E(G') \setminus E(G) \\ 1, & e \in E(G) \end{cases} \qquad (9)$$

If the minimum tour weight of G' is equal to |V(G)|, then the graph G has a Hamiltonian
cycle. The TSP is an NP-hard optimization problem, and the decision version of the problem
(does G' have a tour with total weight less than or equal to some value K) is an NP-complete
problem (Garey and Johnson 1979).
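
The reduction can be sketched on small graphs as follows. This is an illustration only: it assumes weight 1 for edges of G and weight 2 for the added edges (any weight larger than 1 would serve), and it finds the minimum tour by brute force, which is feasible only for a handful of vertices. All function names are hypothetical.

```python
from itertools import combinations, permutations


def hcp_to_tsp(vertices, edges):
    """Complete weighted graph: weight 1 on edges of G, 2 on all other pairs."""
    edge_set = {frozenset(e) for e in edges}
    return {frozenset(p): (1 if frozenset(p) in edge_set else 2)
            for p in combinations(vertices, 2)}


def min_tour_weight(vertices, weight):
    """Brute-force minimum tour weight; expects at least three vertices."""
    first, *rest = vertices
    best = None
    for perm in permutations(rest):
        tour = (first, *perm, first)
        total = sum(weight[frozenset((a, b))] for a, b in zip(tour, tour[1:]))
        if best is None or total < best:
            best = total
    return best


def has_hamiltonian_cycle(vertices, edges):
    """G is Hamiltonian iff the minimum tour weight equals |V(G)|."""
    return min_tour_weight(vertices, hcp_to_tsp(vertices, edges)) == len(vertices)


if __name__ == "__main__":
    square = [(0, 1), (1, 2), (2, 3), (3, 0)]
    star = [(0, 1), (0, 2), (0, 3)]
    print(has_hamiltonian_cycle([0, 1, 2, 3], square))
    print(has_hamiltonian_cycle([0, 1, 2, 3], star))
```

A tour has exactly |V(G)| edges, so its weight equals |V(G)| only when every edge on it has weight 1, that is, when it lies entirely inside G.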


3.1.6.2  Algorithms 

Since the Hamiltonian cycle problem is a specific case of both the sub-graph isomorphism
problem and the traveling salesman problem, any algorithm for solving the SGI or the TSP will
also work to solve the Hamiltonian cycle problem. As the TSP is such a well-known and
well-researched problem, it is highly likely that the best performing algorithms for the HCP will in
fact be TSP algorithms.

A popular TSP algorithm is the Lin-Kernighan (LK) algorithm. The LK algorithm starts
with an arbitrary trail that reaches all vertices of the graph (and may include passing through
some vertices more than once). It then switches paths on the trail in order to shorten it if possible
(Marinakis, Migdalas and Pardalos 2005).

Concorde, an exact algorithm for the TSP, is able to solve optimally 106 out of the 110
instances of the TSP in the TSPLIB (a publicly available library of problem instances for the
TSP, available at http://elib.zib.de/pub/mp-testdata/tsp/tsplib/tsplib.html). Of these instances,
the largest involves 15,112 cities (Skiena 2008). It is also reported
that for the six instances from TSPLIB with between 1000 and 1200 nodes, an algorithm known
as the LKH algorithm (Helsgaun 2000) is able to obtain a solution that is no more than 0.2%
from the optimal in less than 20 seconds (Johnson and McGeoch, Experimental Analysis of
Heuristics for the STSP 2002). Because of this, the instances considered for testing of the
algorithms for The Eighth DIMACS Implementation Challenge (2001) did not include any with
fewer than 1000 nodes (Johnson and McGeoch 2002). These benchmark instances appear to be a
mixture of real-world and randomly generated problems.


3.1.6.3  Existing  Zero-Knowledge  Proofs 

The zero-knowledge proof systems for the HCP have as common input a graph, G, that contains
a Hamiltonian cycle, and as private input a Hamiltonian cycle, C, in G (note that G may contain
more than one Hamiltonian cycle). Figure 9 illustrates a zero-knowledge proof system, ZKP5,
for an instance of the HCP (Blum 1986). The prover (P) chooses randomly an isomorphism π to
permute the graph G and then sends a commitment to this new graph, π(G), to the verifier (V).
V then chooses a random bit c, which is sent to P. If c = 0, P sends π to V along with the
decommitment information for π(G), and V checks that π(G) was formed correctly from G and
π. If c = 1, then P sends the decommitment information for π(C) to V (where π(C) represents
the entries corresponding to the edges of the cycle that is the private input) and V checks that
π(C) forms a Hamiltonian cycle.

Common Input:  The platonic graph of the cube (G) of order 8 (shown)
Private Input: A Hamiltonian cycle (shown in red)

Figure 9: ZKP5 for the Hamiltonian cycle problem example
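
One round of the protocol described above can be sketched as follows. This is a simplified single-process illustration, not a secure implementation: the commitment scheme (SHA-256 over a random nonce and the committed bit) is an assumption, since no particular scheme is fixed here, and the vertex labelling of the cube and all function names are hypothetical.

```python
import hashlib
import random
import secrets


def commit(bit):
    """Hash commitment to one matrix entry; returns (digest, nonce)."""
    nonce = secrets.token_bytes(16)
    return hashlib.sha256(nonce + bytes([bit])).digest(), nonce


def opens_to(digest, nonce, bit):
    return hashlib.sha256(nonce + bytes([bit])).digest() == digest


def permuted_matrix(n, edges, pi):
    """Adjacency matrix of pi(G) for G on vertices 0..n-1."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[pi[u]][pi[v]] = matrix[pi[v]][pi[u]] = 1
    return matrix


def prover_commit(n, edges):
    """Prover: pick a random pi and commit to every entry of pi(G)."""
    pi = list(range(n))
    random.shuffle(pi)
    matrix = permuted_matrix(n, edges, pi)
    pairs = [[commit(matrix[i][j]) for j in range(n)] for i in range(n)]
    digests = [[d for d, _ in row] for row in pairs]
    nonces = [[s for _, s in row] for row in pairs]
    return pi, digests, nonces


def open_cycle(pi, nonces, cycle):
    """Response to c = 1: openings for the edges of pi(C), in cycle order."""
    n = len(cycle)
    opened = []
    for t in range(n):
        i, j = pi[cycle[t]], pi[cycle[(t + 1) % n]]
        opened.append((i, j, nonces[i][j]))
    return opened


def verify_c0(n, edges, digests, pi, nonces):
    """Verifier, c = 0: every commitment must open to the entry of pi(G)."""
    if sorted(pi) != list(range(n)):
        return False
    expected = permuted_matrix(n, edges, pi)
    return all(opens_to(digests[i][j], nonces[i][j], expected[i][j])
               for i in range(n) for j in range(n))


def verify_c1(n, digests, opened):
    """Verifier, c = 1: opened entries are all 1 and chain into one n-cycle."""
    if len(opened) != n:
        return False
    if not all(opens_to(digests[i][j], nonce, 1) for i, j, nonce in opened):
        return False
    chained = all(opened[t][1] == opened[(t + 1) % n][0] for t in range(n))
    return chained and sorted(i for i, _, _ in opened) == list(range(n))


def run_round(n, edges, cycle):
    pi, digests, nonces = prover_commit(n, edges)
    if secrets.randbelow(2) == 0:
        return verify_c0(n, edges, digests, pi, nonces)
    return verify_c1(n, digests, open_cycle(pi, nonces, cycle))


# Cube graph with labels 0..7 (assumed labelling): edges join labels
# that differ in exactly one bit; the cycle follows a Gray code.
CUBE_EDGES = [(u, u ^ (1 << b)) for u in range(8) for b in range(3)
              if u < u ^ (1 << b)]
CUBE_CYCLE = [0, 1, 3, 2, 6, 7, 5, 4]
```

An honest prover passes every round, for example `all(run_round(8, CUBE_EDGES, CUBE_CYCLE) for _ in range(20))`, while a prover whose claimed cycle uses a non-edge fails the c = 1 check because one of the opened entries is 0.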

The zero-knowledge proof system ZKP5 is just one possible protocol for the HCP. A
similar protocol has been published, and the main difference is a reliance on hashing to hide the
information that is committed to in ZKP5 (Caballero-Gil and Hernandez-Goya 2006). There is
also available a third protocol that assumes that families of collision-free hash functions exist in
order to provide a statistical noninteractive zero-knowledge argument with preprocessing
(Damgard 1992).

3.1.6.4  Discussion  of  Existing  Protocols 

While it may seem tempting to use the same HCP protocol (ZKP5) for the TSP, unfortunately
the zero-knowledge property will no longer be satisfied. Because the graph has weighted edges,
when the prover reveals the edges of the cycle the verifier will learn information about the edge
weights associated with the cycle. For example, if every edge has a different weight in the
graph, the verifier will easily be able to identify the TSP tour that is supposed to remain hidden.
To create a valid zero-knowledge proof system for the TSP, we must transform the given
problem into an instance of the sub-graph isomorphism problem. Given a weighted graph
G = (V, E, W) for the TSP (W being the set of edge weights associated with the graph), we
construct a new graph G' as follows. For every edge e ∈ E(G) with W(e) = k, we replace e
with a path with k edges (both endpoints of the path are the endpoints of e). This is illustrated in
Figure 10. The problem is now to find a path in the new graph of length equal to the minimum
TSP weight in the original graph.

Figure 10: An example of the transformation from the traveling salesman
problem to the sub-graph isomorphism problem
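
The edge replacement step can be sketched as follows, assuming positive integer weights. The labels given to the new interior vertices, tuples of the form (u, v, i), are a hypothetical naming choice made only for this illustration.

```python
def subdivide(weighted_edges):
    """Replace each edge (u, v) of weight k by a path with k unweighted edges."""
    new_edges = []
    for u, v, k in weighted_edges:
        if not isinstance(k, int) or k < 1:
            raise ValueError("weights must be positive integers")
        # k - 1 fresh interior vertices between the original endpoints.
        chain = [u] + [(u, v, i) for i in range(1, k)] + [v]
        new_edges.extend(zip(chain, chain[1:]))
    return new_edges


if __name__ == "__main__":
    triangle = [("a", "b", 1), ("b", "c", 2), ("c", "a", 3)]
    for edge in subdivide(triangle):
        print(edge)
```

After the transformation every edge is unweighted, so revealing the edges of a sub-graph no longer exposes individual weights; the total weight of a tour becomes the number of edges of its image.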


3.1.7  Minimum  Bandwidth  Problem 

The minimum bandwidth problem (MBP) is stated as follows: Given a graph G and a positive
integer K, find a linear arrangement of the vertices (a bijective numbering f: V(G) → {1, ..., n})
such that max_{uv ∈ E(G)} |f(u) − f(v)| ≤ K. This decision problem is an NP-complete problem,
and the associated optimization problem (find the minimum value of max_{uv ∈ E(G)} |f(u) − f(v)|)
is an NP-hard problem (Garey and Johnson 1979). Figure 11 illustrates an example of the MBP.

Consider the graph shown. We want an ordering of the vertices A, B, and C such that the bandwidth of
the ordering is at most 1.

In this example, the possible orderings are:

1. [A, B, C]   Bandwidth = 2
2. [A, C, B]   Bandwidth = 2
3. [B, C, A]   Bandwidth = 2
4. [B, A, C]   Bandwidth = 1
5. [C, A, B]   Bandwidth = 1
6. [C, B, A]   Bandwidth = 2

So the bandwidth desired is obtained by both ordering 4 and ordering 5.

Figure 11: Minimum bandwidth problem example
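
The bandwidth of an ordering can be evaluated directly from the definition. The sketch below enumerates all orderings of a three-vertex example; the edge set {AB, AC} is an assumption inferred from the bandwidth values listed in Figure 11, since the drawing itself is not reproduced here.

```python
from itertools import permutations


def bandwidth(ordering, edges):
    """Largest distance, in the ordering, between the endpoints of any edge."""
    position = {v: i + 1 for i, v in enumerate(ordering)}
    return max(abs(position[u] - position[v]) for u, v in edges)


def min_bandwidth(vertices, edges):
    """Brute-force minimum over all orderings (tiny graphs only)."""
    return min(bandwidth(order, edges) for order in permutations(vertices))


if __name__ == "__main__":
    edges = [("A", "B"), ("A", "C")]  # assumed edge set of the example graph
    for order in permutations("ABC"):
        print(list(order), bandwidth(order, edges))
```

The enumeration over all n! orderings is only meant to make the definition concrete; it is not one of the algorithms discussed for the MBP.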


3.1.7.1  Algorithms 

Very  few  algorithms  are  able  to  cope  with  the  MBP  efficiently.  Both  exact  and  approximate 
algorithms  exist  for  the  problem.  As  of  2008,  the  best  exact  algorithm  has  time  complexity 
O(5^n f(|n|)), where f is a polynomial function (Cygan and Pilipczuk 2008). It seems to be an
open problem as to whether the problem can be solved in O(2^n f(|n|)) time, where f is a
polynomial function (Woeginger 2003). As for approximation algorithms, as of 2003 the best
known approximation algorithm has an O((log n)^3 √(log n log log n)) approximation ratio
(Woeginger  2003).  In  2005,  a  hybrid  algorithm  was  presented  that  produces  either  an  ordering 
that  obtains,  in  time,  the  optimal  minimum  bandwidth  or,  in  polynomial  time,  an 

0((log^'^  n)(loglogn)(log^  loglogn))-approximation  (Vassilevska,  Williams  and  Woo  2006). 

A set of useful benchmark instances is available for the minimum bandwidth problem.
The  Harwell-Boeing  Sparse  Matrix  Collection  (Duff  1992)  presents  many  instances  in  a  range  of 
sizes  that  originate  from  real-world  applications.  While  these  instances  are  not  generated  in  any 
uniform  manner,  there  are  several  classes  that  the  algorithms  all  seem  to  struggle  with.  For 
example,  the  “Cannes”  matrices,  with  instances  named  can_###,  stem  from  aircraft  design.  This 
class  of  instances  appears  to  be  difficult  for  many  algorithms  when  the  order  is  larger  than  200. 
When  the  order  is  greater  than  800,  as  in  can_838,  most  algorithms  are  unable  to  solve  it  exactly 
(Lim,  Rodrigues  and  Xiao  2006).  It  would  be  worth  investigating  what  makes  these  instances  so 
difficult  for  the  algorithms. 

3.1.7.2  Translation  to  Sub-graph  Isomorphism 

The minimum bandwidth problem can be viewed as a subproblem of the sub-graph isomorphism
problem. Because of this property, any zero-knowledge proof system for the sub-graph
isomorphism problem can be applied to the minimum bandwidth problem. Let G be a graph with
minimum bandwidth K. Define P_n^K to be the path of length n with additional edges added
between every pair of vertices that are at distance at most K apart (on the original path). Then
the minimum bandwidth problem can be restated as follows: Given a graph G on n vertices, find
an isomorphism π: V(G) → V(H), where H ⊆ P_{n-1}^K. The discovered isomorphism from G to a
sub-graph of P_{n-1}^K will then give a linear order for V(G) with bandwidth at most K.

For the example illustrated in Figure 11, we consider the path P_2^1. In this case, there
exists an isomorphism π that maps V(G) to V(P_2^1). One possible mapping is given by:

$$\pi(A) = 2, \quad \pi(B) = 1, \quad \text{and} \quad \pi(C) = 3 \qquad (10)$$

Thus the isomorphism π gives us a linear ordering identical to ordering number 4 in the example.
We may also consider the problem of finding a linear order for V(G) with bandwidth at most 2.
In this case, we consider the path P_2^2. It is clear that in this case, any of the orderings illustrated
can be mapped isomorphically to a sub-graph of P_2^2.
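
The target graph of the translation, and the check that a mapping embeds G into it, can be sketched as follows. The sketch places the path on vertices 1, ..., n and joins two of them whenever they are at most K apart on the path; the example edge set {AB, AC} and the function names are assumptions made for illustration.

```python
def path_power_edges(n, K):
    """Edges on vertices 1..n joining i and j whenever 1 <= |i - j| <= K."""
    return {frozenset((i, j))
            for i in range(1, n + 1)
            for j in range(i + 1, min(i + K, n) + 1)}


def embeds(edges, mapping, K):
    """True if mapping is a bijection onto 1..n sending every edge of G
    to an edge of the path with the extra distance-K edges."""
    n = len(mapping)
    if sorted(mapping.values()) != list(range(1, n + 1)):
        return False
    target = path_power_edges(n, K)
    return all(frozenset((mapping[u], mapping[v])) in target for u, v in edges)


if __name__ == "__main__":
    edges = [("A", "B"), ("A", "C")]  # assumed edge set of the example graph
    print(embeds(edges, {"A": 2, "B": 1, "C": 3}, K=1))
    print(embeds(edges, {"A": 1, "B": 2, "C": 3}, K=1))
    print(embeds(edges, {"A": 1, "B": 2, "C": 3}, K=2))
```

A mapping accepted by `embeds` is exactly a linear ordering of bandwidth at most K, which is why a proof system for the SGI carries over to the MBP.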

This process of transforming the MBP to the SGI allows us to employ the same
zero-knowledge proof system for the MBP. The common inputs to the protocol are a graph G and the
value K of the minimum bandwidth of the graph. The private input is a linear ordering of the
vertices that has bandwidth K. The prover permutes the path P_{n-1}^K, and sends a commitment to
this to the verifier. The verifier then chooses randomly whether to check if the isomorphism was
constructed correctly or if there is an isomorphic copy of the graph G in P_{n-1}^K.


3.1.8  Summary 

The  sub-graph  isomorphism  class  contains  many  problems  that  may  be  useful  as  base  problems 
for  zero-knowledge  proof  systems.  The  minimum  bandwidth  problem,  for  example,  appears  to 
be  a  difficult  problem  with  relatively  few  efficient  algorithms  to  solve  it.  The  same  is  true  of  the 
longest  path  problem.  The  Hamiltonian  cycle  problem  or  the  Hamiltonian  path  problem  may  be 
difficult as well; however, the longest path problem intuitively seems harder. The Hamiltonian
problems  require  that  all  vertices  be  members  of  the  required  cycle  or  path,  whereas  in  the 
longest  path  problem  a  solver  must  not  only  find  a  path  of  the  required  length  but  must  also 
determine  which  vertices  the  path  traverses. 

One problem that will almost certainly not be useful in creating a secure protocol is the graph
isomorphism problem. The current algorithms (Nauty and VF2, for example) are far too
efficient at solving large problem instances. In order to create a secure protocol based on the graph
isomorphism problem, we would need to use extremely large graphs (over 10,000 nodes), which
then dramatically increases the amount of information to be transferred between prover and
verifier.

3.2 GRAPH COLORING CLASS


The graph coloring class of problems contains three important problems: graph k-colorability,
graph 3-colorability, and equitable 3-colorability. All three problems are NP-complete. Graph
3-colorability is proven NP-complete by a reduction from 3-SAT (a well-known NP-complete
subproblem of the satisfiability problem), which then proves the NP-completeness of graph
k-colorability. Equitable 3-colorability is proven NP-complete by an easy reduction from graph
3-colorability: isolated vertices are added to a 3-colorable graph to obtain a graph that can be
equitably 3-colored. The subproblem structure of the graph coloring class is illustrated in Figure
12.


Figure  12:  The  graph  coloring  class 


3.2.1  Graph  Coloring  Problem 

The graph coloring decision problem is stated as follows: Given a graph G and a positive integer
k, is it possible to color the vertices of G with k colors so that every edge has different colored
endpoints? More formally, is there a function f: V(G) → [k] such that if xy ∈ E(G), then
f(x) ≠ f(y)? Another alternative is to view the graph k-coloring problem as an optimization-type
decision problem. This formulation of the problem is as follows: Given a graph G,
partition the vertices into k sets so that m, the number of edges with both endpoints in the same
partition class, is minimized. Then G is k-colorable if and only if the minimum value obtained is
m = 0. The associated chromatic number problem asks for the minimum number of colors
needed to color V(G) so that if xy ∈ E(G) then f(x) ≠ f(y). The solution to the problem is the
chromatic number of the graph G, and is denoted by χ(G).
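
Both formulations can be checked with a few lines of code. In this minimal sketch a coloring is a dictionary from vertices to colors in {1, ..., k}; the function names are illustrative.

```python
def conflicting_edges(edges, coloring):
    """The quantity m: edges whose two endpoints share a color class."""
    return sum(1 for x, y in edges if coloring[x] == coloring[y])


def is_proper_k_coloring(edges, coloring, k):
    """True if only colors 1..k are used and no edge is monochromatic."""
    uses_allowed_colors = set(coloring.values()) <= set(range(1, k + 1))
    return uses_allowed_colors and conflicting_edges(edges, coloring) == 0


if __name__ == "__main__":
    five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    print(is_proper_k_coloring(five_cycle, {0: 1, 1: 2, 2: 1, 3: 2, 4: 3}, 3))
    print(conflicting_edges(five_cycle, {0: 1, 1: 2, 2: 1, 3: 2, 4: 1}))
```

Verifying a given coloring takes time linear in the number of edges, which is the easy verification step a base problem needs; finding a coloring with m = 0 is the hard part.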

Much  work  has  been  done  in  exploring  which  classes  of  graphs  have  polynomial-time 
optimal coloring algorithms. For example, the general problem can be solved in polynomial time
for any comparability graph (an undirected graph that is transitively orientable) and any chordal
graph (a graph with no induced cycle with length greater than 3) (Golumbic 1980). We also note
that when considering a graph with maximum degree at most k, the decision problem becomes
trivial due to Brooks' Theorem, which states that Δ(G) ≥ χ(G) for any graph G that is neither a
complete graph nor an odd cycle (Diestel 2006). For the class of random graphs G(n, p), there
exist linear-time algorithms for optimal coloring when p > 1.01/n, where p is the edge
probability associated with the random graph (Coja-Oghlan and Taraz 2004).

3.2.1.1  Algorithms 

Much work has been done on developing and improving efficient algorithms for the graph
coloring problem. While there is an abundance of algorithms focused on achieving near-optimal
colorings of a graph, there are very few exact algorithms. However, the near-optimal algorithms
can in many cases achieve colorings of a large number of graphs that use the minimum possible
number of colors. The algorithms are either based on a local search method, such as tabu search,
or on a branch-and-bound-type pruning of the entire search space. Some of the most commonly
appearing algorithms in the literature are DSATUR, Tabucol, GH, VNS and Amacol.

Instead of relying on a local search, DSATUR depends on a specific ordering of the
vertices. While many improvements have been made to the algorithm, the original method
colors the vertices according to the number of colors already present in their neighborhoods.
The DSATUR algorithm is continuously being improved upon and is still competitive with the
current algorithms (Brelaz 1979). Other similar algorithms based on specific vertex orderings,
such as RLF, often appear as a piece of a larger algorithm instead of as a standalone method like
DSATUR.
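
The ordering rule can be sketched as follows. This is a simplified rendering of the basic greedy idea, not one of the improved variants mentioned above: the next vertex is the uncolored one whose neighbors already show the most distinct colors, with ties broken by degree (the tie-breaking rule is an assumption of this sketch).

```python
def dsatur(adj):
    """Greedy coloring that always picks the most saturated uncolored vertex.

    adj maps each vertex to the set of its neighbors; colors are 0, 1, 2, ...
    """
    colors = {}
    uncolored = set(adj)

    def priority(v):
        saturation = len({colors[w] for w in adj[v] if w in colors})
        return (saturation, len(adj[v]))

    while uncolored:
        v = max(uncolored, key=priority)
        used = {colors[w] for w in adj[v] if w in colors}
        color = 0
        while color in used:  # smallest color absent from the neighborhood
            color += 1
        colors[v] = color
        uncolored.remove(v)
    return colors


def cycle_graph(n):
    """Adjacency sets of the cycle on vertices 0..n-1 (helper for the demo)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


if __name__ == "__main__":
    coloring = dsatur(cycle_graph(6))
    print(len(set(coloring.values())))
```

The result is always a proper coloring, but in general only an upper bound on the chromatic number.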

While over 20 years old, Tabucol, a local search algorithm based on tabu search, remains
very popular. Tabucol first assigns a random k-coloring to the graph, usually with a significant
number of conflicting edges (edges with endpoints of the same color). The algorithm then
improves this coloring until it has reached the maximum number of iterations allowed (Galinier
and Hertz 2006).

The Variable Neighborhood Search algorithm (VNS) is similar to Tabucol, but modifies
the searching method. While Tabucol relies on tabu search, VNS uses several neighborhoods in
order to avoid getting stuck at local optima (Avanthay, Hertz and Zufferey 2003). Variable
Space Search (VSS) is an improvement of VNS. VSS expands upon the idea of considering
many neighborhoods to also consider multiple objective functions and search spaces (Hertz,
Plumettaz and Zufferey 2008).

A very competitive algorithm is GH, a hybrid evolutionary algorithm, which relies upon
a local search method and a crossover function. The crossover function builds a new solution
from two previously created partial solutions. GH is quite competitive when it is able to
compute an answer under a given time constraint; however, there are many instances where GH
does not come up with any solution (Galinier and Hao 1999).

Perhaps the newest algorithm that is worth considering is Amacol. Amacol relies on a
central memory solution that contains pieces of solutions and is continuously updated. Using
what is currently in the central memory solution, Amacol runs a local search method to improve
and create a better solution, and then stores pieces of this new solution (Galinier, Hertz and
Zufferey 2008).

While  there  is  no  one  reference  that  runs  experiments  on  all  four  of  these  algorithms  side- 
by-side,  there  has  been  a  set  of  experiments  run  comparing  Tabucol,  DSATUR,  GH,  and  Amacol 
(Galinier,  Hertz  and  Zufferey  2008),  another  set  comparing  DSATUR,  Tabucol,  RLF,  and  VNS 
(Galinier  and  Hertz  2006),  and  yet  another  set  comparing  VSS  and  Tabucol  (Hertz,  Plumettaz 
and  Zufferey  2008). 

Almost all tests run used specific benchmark sets of graphs, such as the graphs from
The Second DIMACS Implementation Challenge: 1992-1993 (Johnson and Trick, Volume 26:
DIMACS Series in Discrete Mathematics and Theoretical Computer Science 1996), that are
generally considered to be difficult (meaning that most algorithms struggle to find optimal
solutions). In most cases, the DIMACS graphs used contain either 500 or 1000 vertices. Other
classes of graphs were used as well, such as the flat graphs (each containing 1000 vertices). VSS
was able to obtain an optimal coloring on 16 out of 20 test graphs with a time limit of one hour.
In the tests that did not produce optimal colorings, VSS produced colorings using at most five
extra colors. Similar results are shown for Tabucol, but with fewer optimal colorings.
According to test results from July 2008, GH outperforms all of the other listed algorithms,
except for the few cases where it is unable to determine a result under the time constraint (Hertz,
Plumettaz and Zufferey 2008).


3.2.1.2 Existing Zero-Knowledge Proofs

Several zero-knowledge proof systems have been developed for the graph 3-coloring problem
(G3C). All proof systems take as public input a graph G and as private input φ, a 3-coloring of
the vertices of G. In addition to the protocols discussed here, there are a few variations that have
been published, such as a protocol that relies upon hiding the coloring through an isomorphic
graph (Grigoriev and Shpilrain 2008) and non-interactive zero-knowledge proof systems (Blum,
Feldman and Micali 1988), (Kurosawa and Takai 1992).

Figure 13 illustrates ZKP6, a zero-knowledge proof system for G3C (Goldreich, Micali
and Wigderson 1991). In the protocol, the prover permutes the coloring of the graph and
commits to it before sending it to the verifier. The verifier then selects an edge from the graph
randomly and uniformly, sends the edge choice to the prover, and then asks the prover to verify
that the edge's endpoints have distinct colors.

A similar zero-knowledge proof system is ZKP7, illustrated in Figure 14. The method of
proof is the same; however, it is run in parallel instead of sequentially. The prover creates t
permutations of the coloring to commit to, while the verifier commits to a set of t edges to
challenge the prover with.

Common Input:  The Petersen Graph (G) of order 10
Private Input: A 3-coloring φ of V(G) (as shown)

Figure 13: ZKP6 for the graph 3-coloring problem example

3.2.1.3 Discussion of Existing Protocols

If the prover is cheating and does not have a valid 3-coloring of G during ZKP6, then when the
prover attempts to 3-color G, the coloring will have at least one edge with both endpoints colored
the same. The probability that the verifier will choose an edge that the prover has colored
incorrectly can be as low as 1/|E(G)|. This implies that the probability that the prover will be
discovered as a cheater is as low as 1/|E(G)|, and hence the probability that a cheater will not be
discovered in one round could be as high as 1 − 1/|E(G)|. The number of iterations necessary
to achieve a good confidence level in this protocol can therefore be extremely high. To illustrate
how bad this probability is, on a graph with only 1,000 edges, we would need to perform 4603
iterations of the protocol in order to achieve a 99% confidence level.
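
The iteration count quoted above follows from requiring (1 − 1/|E(G)|)^k ≤ 1 − confidence. A minimal sketch of the calculation, with an illustrative function name:

```python
import math


def zkp6_rounds(num_edges, confidence):
    """Smallest k with (1 - 1/num_edges)^k <= 1 - confidence."""
    escape_per_round = 1.0 - 1.0 / num_edges
    return math.ceil(math.log(1.0 - confidence) / math.log(escape_per_round))


if __name__ == "__main__":
    print(zkp6_rounds(1000, 0.99))
```

For 1,000 edges and a 99% confidence level this evaluates to the 4603 rounds stated in the text.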




In ZKP7, the probability that a cheating prover escapes detection drops to (1 − 1/|E(G)|)^t in
each round. When we take t = 2n|E(G)|, this reduces to (1 − 1/|E(G)|)^t < e^{-2n}. Thus merely
by choosing t large enough, it is possible to achieve any desired confidence level in just one
round (Goldreich and Kahan 1996).
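
The effect of running t challenges in parallel can be checked numerically. The sketch below evaluates (1 − 1/|E(G)|)^t for t = 2n|E(G)| and compares it with e^{-2n}, using the Petersen graph (10 vertices, 15 edges) as an example; the function name is illustrative.

```python
import math


def zkp7_escape_bound(n, num_edges, t=None):
    """Bound (1 - 1/|E|)^t on a cheater surviving one round with t challenges."""
    if t is None:
        t = 2 * n * num_edges  # the choice of t discussed in the text
    return (1.0 - 1.0 / num_edges) ** t


if __name__ == "__main__":
    n, m = 10, 15  # Petersen graph
    print(zkp7_escape_bound(n, m), math.exp(-2 * n))
```

Because (1 − 1/m)^m < 1/e for every m > 1, the bound with t = 2nm is always below e^{-2n}; the price is that t colorings and t edges must be exchanged in that single round.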

Now  we  compute  the  number  of  bits  to  be  transferred  during  the  two  protocols  discussed. 
In  ZKP6,  the  prover  does  not  need  to  send  an  adjacency  matrix.  Instead,  the  prover  sends  an  n- 
element  list  in  which  the  i'^  position  of  the  list  contains  the  color  of  vertex  i.  Since  there  are  3 
possible  colors,  each  entry  requires  at  most  2  bits  to  be  recognized.  The  total  number  of  bits 
needed  to  transmit  the  coloring  will  thus  be  at  most  2n  (not  including  commitment).  In  step  5, 
the  verifier  needs  to  transmit  the  two  vertices  that  identify  each  edge  selected.  If  there  are  n 
vertices,  then  to  represent  a  vertex  the  verifier  will  need  at  most  log2  n  bits.  Since  two  vertices 
must  be  sent,  the  verifier  will  transmit  at  most  2  •  log2  n  bits.  Thus  not  including  what  is  needed 
for  commitment,  the  total  number  of  bits  sent  will  be: 

2n  +  2  •  log2  n  (12) 

If  the  maximum  amount  of  information  that  can  be  transmitted  is  10  kilobits,  then  we  must  have: 

2n  +  2  •  log2  n  <  10000  (13) 

n ≤ 4987 (14)

Hence  the  largest  graph  that  could  be  considered  would  need  to  have  at  most  4987  vertices. 
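
As a check on inequality (13), the bound can be found by direct search. This is a minimal sketch; the helper names `zkp6_bits` and `largest_n` are ours, and the search assumes the bit count increases with n.

```python
import math

def zkp6_bits(n: int) -> float:
    """Bits per round in ZKP6, excluding commitments: 2n + 2*log2(n)."""
    return 2 * n + 2 * math.log2(n)

def largest_n(cost, budget: float) -> int:
    """Largest n >= 2 with cost(n) < budget, assuming cost is increasing in n."""
    n = 2
    while cost(n + 1) < budget:
        n += 1
    return n

print(largest_n(zkp6_bits, 10_000))  # 4987
```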

It is clear that in ZKP7, the amount of information that must be transferred increases
dramatically from ZKP6. In ZKP7, the prover again sends an n-element list such that the i-th
position of the list contains the color of vertex i. Since there are 3 possible colors, each entry
needs at most 2 bits, and so again (as in ZKP6), the total number of bits to transmit one coloring
will be at most 2n (not including what is needed for commitments). However, in this case there
are t different colorings being sent, and so the total number of bits needed is at most 2tn. In step
5, the verifier must transmit the two vertices that identify each edge selected, as before; however,
the verifier must now transmit a list of t edges, which requires 2t • log2 n bits.

Not including what is needed for commitments, the total number of bits sent during ZKP7 will
be:

2tn  +  2t  •  log2  n  (15) 

If  the  maximum  amount  of  information  to  be  transmitted  is  10  kilobits,  then  we  must  have: 

2tn  +  2t  •  log2  n  <  10000  (16) 

In any graph, |E(G)| ≤ C(n,2) = n(n - 1)/2. If we assume, as in the original publication of ZKP7
(Goldreich and Kahan 1996), that t = 2n|E(G)|, then we must have that the above inequality
simplifies to n < 10. This tells us that the graphs that we should be considering can have at
most 10 vertices, which is not likely to be a very difficult graph coloring instance.

Also, if we desire a 99.99% probability of catching a cheating prover in one round, then
we must have (when t = 2n|E(G)|):

e^(-n) < 0.0001 (17)

n > 9.21 (18)

Thus  on  problem  instances  with  exactly  10  vertices,  we  can  achieve  the  desired 
confidence  level  in  one  round  while  remaining  under  the  upper  limit  of  the  number  of  bits  to  be 
transmitted.  However,  these  instances  will  be  solvable  quickly  and  so  will  not  be  of  use  in 
creating  a  secure  protocol. 
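
To make expression (15) concrete, the sketch below evaluates it for one small instance. The choice of instance is an assumption made only for illustration: a 10-vertex graph with 15 edges (the size of the Petersen graph), with t = 2n|E(G)|. The function names are ours.

```python
import math

def zkp7_bits(n: int, t: int) -> float:
    """Bits per round in ZKP7, excluding commitments: 2tn + 2t*log2(n)."""
    return 2 * t * n + 2 * t * math.log2(n)

def escape_probability(num_edges: int, t: int) -> float:
    """Chance that t independently chosen edges all miss one bad edge."""
    return (1 - 1 / num_edges) ** t

n, num_edges = 10, 15            # assumed example instance
t = 2 * n * num_edges            # t = 300
print(round(zkp7_bits(n, t)))    # 7993
print(escape_probability(num_edges, t) < math.exp(-n))  # True
```

For this sparse instance the transfer stays under the 10 kilobit limit; the bit count grows linearly in t, so denser graphs with the same n require more.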


3.2.2  Equitable  Coloring  Problem 

The equitable coloring problem (E3C) is formally stated as follows: Given a graph G, color the
vertices of G with as few colors as possible such that any two color classes differ in size by at
most 1. This problem is NP-complete, and the proof of this is a fairly straightforward reduction
from graph coloring.

Note  that  any  equitable  coloring  algorithm  is  also  a  general  graph  coloring  algorithm,  and 
hence the E3C must be at least as hard as the G3C in terms of algorithms finding optimal
solutions.  Because  of  this,  there  are  no  algorithms  to  be  presented  here  that  were  not  previously 
discussed  in  the  general  graph  coloring  section. 


3.2.2.1  Application  to  Zero-Knowledge  Proofs 

Note that either of the previously discussed protocols could be applied to the E3C, as the E3C is
a subproblem of the G3C. In the zero-knowledge proof systems discussed for the G3C, ZKP6
has a low probability of catching a cheating prover, while ZKP7 requires a large amount of
information to be transmitted. To address these problems, we turn to the E3C.

A zero-knowledge proof system, ZKP8, is illustrated in Figure 15. The protocol is
similar to that of the independent set problem, and takes as public input a graph G and as private
input φ, an equitable 3-coloring of G. The graph must be committed to as a permuted adjacency
matrix to hide the locations of the vertices in each color class, and the coloring must also be
committed to (in a list format). The verifier will choose to either check that the permutation was
performed correctly or to check that a specified color class induces an independent set in the
graph. If the prover has an invalid 3-coloring, then at least one of the three color classes will not
be independent, so a verifier who requests a color class selects a non-independent one with
probability at least 1/3. Since the verifier requests a color class with probability 1/2, the
probability that the verifier will catch a cheating prover will increase to 1/6.
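
The 1/6 figure can be checked by enumerating the verifier's choices. The following is a sketch under stated assumptions: the cheating prover commits to an honest permutation (so the challenge c = 0 never exposes the cheat), the verifier picks c and the color class uniformly, and the toy graph and the function name `catch_probability` are ours.

```python
from fractions import Fraction

def catch_probability(edges, coloring, num_colors=3):
    """Per-round rejection probability for a committed (possibly invalid) coloring."""
    # A color class is "bad" if it contains both endpoints of some edge.
    bad_classes = {coloring[u] for u, v in edges if coloring[u] == coloring[v]}
    # c = 1 is chosen with probability 1/2, then one class uniformly at random.
    return Fraction(1, 2) * Fraction(len(bad_classes), num_colors)

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]  # triangle with one pendant vertex
cheat = [0, 1, 2, 2]                      # vertices 2 and 3 share a color
print(catch_probability(edges, cheat))    # 1/6
```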

Common Input: The Petersen Graph (G) of order 10

Private Input: An equitable 3-coloring φ of G (as shown)

Figure 15: ZKP8 for the equitable 3-coloring problem example


It is important to note that ZKP8 is only a valid zero-knowledge proof system for the
E3C, not for the G3C. If we consider ZKP8 as applied to the G3C, the prover is showing not
only that one of the color classes is an independent set, but also the size, k, of the requested color
class. The transmission of the size of one color class from the prover to the verifier prevents the
protocol from being a zero-knowledge proof system. There is no way that a verifier could have
determined alone that one of the color classes has size k, and so the proof of the zero-knowledge
property using simulators would fail.

We must consider now whether ZKP8 is more efficient than either ZKP6 or ZKP7. First,
we calculate the total number of bits that must be transferred. Since the graphs we are
considering in this example are simple, undirected graphs, the adjacency matrices will be
symmetric with zeros along the diagonal and with all entries either 0 or 1. Thus the prover only
needs to transmit C(n,2) = n(n - 1)/2 entries of A to the verifier. The transmission of the coloring
needs 2n bits. Hence step 4 requires the transmission of C(n,2) + 2n committed entries, each of
which is one bit. In step 6, the verifier sends one bit. If c = 0, the prover must send the
isomorphism π. We can send this in list form, and so we will need n log2 n bits. If c = 1, the
verifier must send an identifier for a color class. Since there are three different color classes, this
will require 2 bits. Then the prover must also send the decommitment information for the entries
corresponding to edges within the specified color class.

Not  including  what  is  needed  for  commitments,  the  total  number  of  bits  sent  will  be: 

C(n,2) + 2n + 1 + n log2 n (19)


If the maximum amount of information to be transmitted is 10 kilobits, then we must have:

C(n,2) + 2n + 1 + n log2 n < 10000 (20)

n ≤ 133 (21)

The  largest  graph  to  be  considered  would  need  to  have  at  most  133  vertices. 
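
The breakdown of expression (19) at the limiting size shows which term uses up the budget. This is an illustrative sketch only; the function name `zkp8_bits` is ours and commitments are excluded, as in the text.

```python
import math

def zkp8_bits(n: int) -> float:
    """Bits per round in ZKP8 (c = 0 case), excluding commitments."""
    return math.comb(n, 2) + 2 * n + 1 + n * math.log2(n)

n = 2
while zkp8_bits(n + 1) < 10_000:   # search upward for the largest admissible n
    n += 1

print(n)                           # 133
print(math.comb(n, 2))             # 8778 bits for the adjacency matrix
print(2 * n)                       # 266 bits for the coloring
print(round(n * math.log2(n)))     # 938 bits for the permutation
```

The adjacency matrix accounts for most of the transfer, which is why the admissible graph size is so much smaller than for ZKP6.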

The protocol is more efficient than ZKP6 in terms of the number of rounds necessary to
achieve an adequate confidence level. With a per-round detection probability of 1/6, ZKP8
requires 26 iterations to achieve a 99% chance of catching a cheating prover, since
(5/6)^26 < 0.01 (recall that ZKP6 required 4603 iterations). We also note that if it is
possible to use as common input graphs in which it is difficult to produce even two independent
color classes, the chance that the verifier can catch a cheating prover increases from 1/6 to 1/3.

Thus  ZKP8  is  a  more  efficient  protocol  for  E3C  than  ZKP7,  and  also  gives  a  better 
probability  of  catching  a  cheating  prover  than  ZKP6  (and  ZKP7  depending  on  what  value  of  t  is 
chosen).  All  that  remains  is  to  prove  that  ZKP8  is  in  fact  a  valid  zero-knowledge  proof  system. 


Claim:  ZKP8  is  a  zero-knowledge  proof  system. 

Proof: 

Completeness: If the prover has a yes-instance x of E3C, then the verifier will accept x
with probability 1.

Soundness: If the prover has a no-instance y of E3C, the prover will be caught only if the
verifier chooses c = 1, and if the verifier selects a color class that is not independent.
Since c is chosen uniformly and randomly by the verifier, the probability that the verifier
will reject y is 1/6 in each round. This implies that the probability that the verifier does
not reject y after c rounds is at most (5/6)^c. When we repeat the protocol for 6k rounds,
this probability is at most (5/6)^(6k), which is asymptotically close to (and never
exceeding) 2^(-k).


Zero-Knowledge  Property:  Suppose  the  verifier  is  attempting  to  extract  useful 
information  from  his  conversation  with  the  prover.  Then  the  verifier  can,  in  the  same 
manner,  extract  the  information  even  without  the  aid  of  the  prover.  In  each  round  he  does 
the  following: 


Begin. 

Verifier  simulates  the  prover.  The  verifier  flips  a  fair  coin  and,  according  to  the 
outcome  of  the  coin,  commits  to  either  the  graph  G  or  an  arbitrary  3-partition  of  n 
vertices  in  which  each  partition  class  is  an  independent  set.  G  is  committed  to  in 
the  same  way  the  prover  would  have  done  so.  The  partition  is  committed  to  in 
just  the  way  the  prover  would  have  committed  to  such  a  partition  in  G.  Then, 
acting  as  prover,  he  presents  the  committed  information  to  the  verifier.  Now  he 
takes  the  other  side. 


Verifier  simulates  the  verifier.  The  verifier  guesses  randomly  and  uniformly 
whether  to  request  a  graph  or  a  partition.  Because  the  verifier  has  no  way  to 
guess  with  any  advantage  whether  the  committed  matrix  contains  a  graph  or  a 
partition  (because  the  choice  is  random),  there  is  a  50%  chance  that  he  requests  an 
option  (graph  or  partition)  that  the  verifier,  in  the  guise  of  prover,  can  supply  (in 
all  cases).  If  a  partition  was  requested  but  a  graph  had  been  committed  to,  then 
the verifier guesses randomly and uniformly which color class to request. Then
there  is  a  67%  chance  that  the  verifier,  in  the  guise  of  prover,  can  supply  what  was 
requested  correctly.  This  gives  a  total  chance  of  83%  that  the  verifier,  in  the  guise 
of  the  prover,  can  supply  what  is  requested.  If  what  is  requested  cannot  be 
supplied,  the  verifier  backs  up  the  simulation  to  the  state  it  was  in  at  the  start  of 
this  round  and  restarts  the  entire  round  (verifier  simulating  the  prover). 

End. 

In an expected 6 passes through each round, the verifier will obtain the information
without the help of the prover. Thus the interaction does not help the verifier do
something with the prover in expected polynomial time that he could not as well have
done without the prover in expected polynomial time. ■

3.2.3  Summary 

The  graph  coloring  problem  and  equitable  coloring  problem  have  positive  and  negative  attributes 
in  terms  of  zero-knowledge  proof  systems.  A  positive  feature  of  these  problems  is  the  difficulty 
level.  There  exist  difficult  instances  of  the  problems,  and  methods  have  been  published  on  how 
to create difficult instances. This would provide a strong foundation for a zero-knowledge proof
system. The negative aspects of the coloring problems are the soundness probabilities of the
proof systems. Compared to the soundness probability of 1/2 that we see in the sub-graph
isomorphism problem and sub-problems, equitable coloring is able to achieve only a soundness
probability of 1/6. This means increasing the number of rounds from 7 to 26 in order to achieve a
99% probability of catching a cheating prover. Given the scenarios in which we are looking to
employ zero-knowledge proof systems, it is not realistic to expect that 26 rounds of one protocol
will be possible. In order to utilize graph coloring or equitable coloring, we first need to develop
a better zero-knowledge proof system.


3.3  OTHER  NP-COMPLETE  PROBLEMS 
3.3.1  Satisfiability 

The  satisfiability  problem  (SAT)  was  the  first  problem  to  be  proven  NP-complete  (Garey  and 
Johnson  1979).  The  problem  falls  under  the  category  of  propositional  logic,  and  is  stated  as 
follows:  Given  a  set  of  Boolean  variables  and  a  collection  of  clauses  over  the  set  of  variables,  is 
there  a  truth  assignment  for  the  variables  such  that  every  clause  in  the  collection  is  satisfied? 
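
The definition can be made concrete with a brute-force check. The sketch below uses the small example instance that appears in Figure 16, (a ∨ ¬b) ∧ (¬a ∨ c); the encoding of literals as (variable, is_positive) pairs and the function names are our own choices.

```python
from itertools import product

# A literal is (variable, is_positive); a clause is a list of literals.
clauses = [[("a", True), ("b", False)], [("a", False), ("c", True)]]

def satisfies(assignment, clauses):
    """True if every clause contains at least one literal made true."""
    return all(any(assignment[var] == positive for var, positive in clause)
               for clause in clauses)

def satisfying_assignments(variables, clauses):
    """Yield every truth assignment that satisfies all clauses (exponential time)."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(assignment, clauses):
            yield assignment

print(len(list(satisfying_assignments(["a", "b", "c"], clauses))))  # 4
```

Checking one assignment takes time linear in the size of the formula, while the exhaustive search over all 2^|U| assignments is what makes the problem hard in general.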


3.3.1.1  Algorithms 

Much  work  has  been  done  on  developing  algorithms  to  quickly  and  efficiently  solve 
instances  of  the  SAT  problem.  The  algorithms  fall  into  two  distinct  categories:  complete 
algorithms and incomplete algorithms. Incomplete algorithms are based on stochastic local search
and are often faster, but they cannot prove that an instance of SAT is unsatisfiable. Some
well-known incomplete algorithms are WalkSAT and GSAT. Complete algorithms are systematic
search  algorithms  and  usually  run  slower,  but  are  able  to  determine  when  an  instance  of  SAT  is 
unsatisfiable.  Some  complete  algorithms  that  are  used  often  are  DPLL,  SATO,  and  GRASP. 

One  of  the  first  algorithms  published  was  the  Davis-Putnam-Logemann-Loveland 
(DPLL)  algorithm.  The  algorithm  is  still  favored  today,  and  many  newer  algorithms  such  as 
GRASP  (Marques-Silva  and  Sakallah  1999),  SATO  (Zhang  1997),  and  BerkMin  (Goldberg  and 
Novikov 2002) were created with the same basic idea but with some modifications and
improvements. The main ideas of the systematic search algorithms are backtracking and pruning
the search space.
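
The backtracking and pruning ideas can be sketched in a few lines. This is a simplified DPLL-style search written for illustration, not the procedure of any of the solvers cited above; it uses unit propagation as its only pruning rule, and the integer encoding of literals (k for a variable, -k for its negation) is an assumption of the sketch.

```python
def dpll(clauses, assignment=None):
    """Return a satisfying (possibly partial) assignment {var: bool}, or None."""
    assignment = dict(assignment or {})
    clauses = [list(clause) for clause in clauses]
    while True:
        simplified, unit = [], None
        for clause in clauses:
            remaining, satisfied = [], False
            for lit in clause:
                var, want = abs(lit), lit > 0
                if var in assignment:
                    if assignment[var] == want:
                        satisfied = True
                        break
                else:
                    remaining.append(lit)
            if satisfied:
                continue
            if not remaining:
                return None              # empty clause: prune this branch
            if len(remaining) == 1 and unit is None:
                unit = remaining[0]
            simplified.append(remaining)
        clauses = simplified
        if not clauses:
            return assignment            # every clause is satisfied
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0  # unit propagation
    var = abs(clauses[0][0])              # branch on an unassigned variable
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None                           # both branches failed: backtrack

print(dpll([[1, -2], [-1, 3]]))  # {1: True, 3: True}
print(dpll([[1], [-1]]))         # None
```

Variables missing from the returned dictionary may take either value.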

The incomplete algorithms available are also quite efficient on satisfiable instances.
When comparing the popular algorithms WalkSAT and GSAT, it appears that neither algorithm
is able to outperform the other consistently (Hoos and Stutzle 2000). A new variant of
WalkSAT, gNovelty+, has been developed recently for the annual SAT competition, and appears
to perform well in the random SAT area (Jia 2007).

A valuable resource for determining runtime performance of the most up-to-date SAT
solvers is the annual International SAT Competition*. In the 2008 SAT Competition, the solvers
were given 100 instances of SAT, some of which were unsatisfiable (so only complete algorithms
competed). The solvers were allowed 900 seconds (15 minutes) to solve each instance (or
determine the instance unsatisfiable) before timing out. The instances consisted of anywhere
from 286 to 11,483,525 variables and from 1742 to 32,697,150 clauses.
The  instances  were  taken  from  several  benchmark  suites,  as  well  as  instances  from  past  SAT 
Competitions  (which  includes  random  instances).  The  first  place  winner,  MiniSat  2.1,  was  able 
to  solve  81  out  of  the  100  instances  correctly,  and  the  top  four  winners  all  solved  more  than  75 
out  of  the  100  instances  correctly. 


3.3.1.2 Existing Zero-Knowledge Proofs

There  seems  to  be  less  work  done  on  the  SAT  problem  with  regard  to  zero-knowledge  proof 
systems  than  for  some  other  NP-complete  problems  like  graph  3-colorability  or  sub-graph 
isomorphism.  A  zero-knowledge  proof  system  for  the  SAT  problem,  ZKP9,  which  is  illustrated 
in  Figure  16,  takes  as  common  input  a  set  U  of  Boolean  variables  and  a  collection  C  of  clauses 
and  as  private  input  a  set  of  true/false  assignments  for  the  variables  in  U. 

The  prover  (P)  constructs  the  circuit  of  truth  tables  that  corresponds  to  the  instance  of  the 
problem.  P  then  randomly  permutes  the  rows  of  each  truth  table,  and  randomly  complements  the 
columns  of  the  tables,  except  for  the  last  column  of  the  last  table.  P  then  sends  a  commitment  to 
the  permuted  and  complemented  set  of  tables  to  the  verifier  (V).  V  chooses  a  random  bit  c  and 
sends  c  to  P.  If  V  sends  c  =  0  to  P,  then  P  sends  the  decommitment  information  for  all  of  the 
truth  tables  to  V.  P  also  notifies  V  as  to  which  columns  were  complemented.  V  then  checks  that 
the  truth  tables  were  constructed  correctly.  If  V  sends  c  =  1  to  P,  then  P  sends  to  V  the 
decommitment  information  for  only  the  rows  that  correspond  to  a  satisfying  truth  assignment 
before complementing took place. V then checks that these rows lead to a final output of true
(which is why the last column of the final truth table cannot be complemented)
(Brassard, Chaum and Crepeau 1988).


* Available at: http://www.satcompetition.org/

Few other zero-knowledge proof systems have been published for the SAT problem.
Papers have been published on non-interactive zero-knowledge proof systems (Damgard 1992),
zero-knowledge proof systems with two provers (Dwork, et al. 1992), and an interactive
zero-knowledge proof system that focuses on a more secure commitment method than in the
protocol presented here (Brassard and Crepeau 1986). While these other protocols may vary
slightly from the one illustrated in this report, the extent to which the SAT problem has been
studied prevents the problem from being a secure base problem in a protocol.
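
The protocols in this report rely on commitments but leave the commitment method open. As one possible illustration, the sketch below commits to a single bit by hashing it together with a random nonce. This hash-based construction is our assumption and is not the method of the papers cited above; the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def commit(bit: int) -> tuple[bytes, bytes]:
    """Return (commitment, nonce); the nonce is kept secret until opening."""
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + bytes([bit])).digest()
    return digest, nonce

def verify_opening(commitment: bytes, bit: int, nonce: bytes) -> bool:
    """Recompute the hash from the revealed bit and nonce and compare."""
    expected = hashlib.sha256(nonce + bytes([bit])).digest()
    return hmac.compare_digest(commitment, expected)

commitment, nonce = commit(1)
print(verify_opening(commitment, 1, nonce))  # True
print(verify_opening(commitment, 0, nonce))  # False
```

Under this sketch each committed bit would cost 256 bits on the wire, far more than the uncommitted bit counts used in the estimates of this report.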

Common Input: A set U of Boolean variables and a collection C of clauses.

Private Input: A set of assignments for U such that every clause in C is satisfied.

Example Instance:

U = { a, b, c }

C = { (a ∨ ¬b) ∧ (¬a ∨ c) }

Prover:
1. Randomly permutes the rows of the truth tables.
2. Randomly complements the columns of the truth tables except for the last column in the
final table. (Shown in green)
3. Sends a commitment to the scrambled tables to the verifier.

Verifier:
4. Chooses a random bit c.
5. Sends c to the prover.

If c = 0:
6. Prover: Sends the decommitment information for all truth table entries to the verifier and
identifies which columns were complemented.
7. Verifier: Checks that the truth tables were formed correctly.

If c = 1:
6. Prover: Sends the decommitment information for the row in each truth table that
corresponds to the satisfying truth assignment.
7. Verifier: Checks that the sequence of rows revealed outputs true.


Figure 16: ZKP9 for the satisfiability problem example


3.3.1.3 Discussion of Existing Protocols

In discussing the amount of information transferred in the protocol, we will consider an instance
of 3-SAT, as any SAT instance can be transformed to a 3-SAT instance. In an instance of
3-SAT, each clause has three variables. The truth table for each clause will have 2^3 = 8 rows and
4 columns, giving a total of 32 entries. Since each entry is either true or false, we need only 32
bits to send the entries for each clause. If there are |C| clauses total, then in step 3 the prover
needs 32|C| bits total to send the truth tables (not including the bits needed for the commitment
process). In step 5, the verifier sends 1 bit. If c = 0, then the prover must send the identifiers
for each column in each truth table that is complemented. Since each truth table has 4 columns,
at most 4|C| - 1 columns can be complemented. To send a list of numbers representing the
columns that are complemented, the prover must transfer (4|C| - 1) log2(4|C| - 1) bits. If
c = 1, then the prover must reveal one row from each truth table by sending the appropriate
decommitment information.

When  considering  the  maximum  number  of  bits  that  will  be  necessary  in  the  zero- 
knowledge  proof  system  illustrated  (not  including  what  is  needed  for  commitment),  the  number 
transferred  will  be: 

32|C|  +  1  +  (4|C|  -  1)  log2(4|C|  -  1)  (22) 

If  the  maximum  amount  of  information  to  be  transmitted  is  10  kilobits,  then  we  must  have: 

32|C|  +  1  +  (4|C|  -  1)  log2(4|C|  -  1)  <  10000  (23) 

|C| ≤ 145 (24)

Thus the largest instance of 3-SAT that could be considered would have at most 145
clauses.  Considering  the  efficiency  of  the  SAT  competition  solvers,  the  instances  that  would  be 
allowed  under  this  information  restriction  would  not  create  secure  protocols. 


3.3.2  Graph  Partitioning  Problem 

The  graph  partitioning  problem  (GPP)  is  stated  as  follows:  Given  a  graph  G  and  positive 
integers  k  and  M,  is  there  a  partition  of  the  vertices  into  k  equal-sized  classes  so  that  there  are  at 
most  M  edges  with  endpoints  in  different  partition  classes?  In  general,  the  GPP  can  consider 
both  weighted  and  unweighted  graphs.  The  GPP  is  an  NP-complete  problem  in  both  the  general 
case  (allowing  weighted  vertices  and  edges)  and  in  the  case  restricted  to  unweighted  graphs 
(Garey  and  Johnson  1979). 


3.3.2.1  Algorithms 

There are several algorithms created to solve the GPP. One of the most well-known is the
Kernighan-Lin (KL) algorithm, developed in 1970 (Kernighan and Lin 1970). The KL algorithm
begins with an initial partition, and then improves it by swapping vertices between the partition
classes. This method will clearly terminate at a local minimum. However, by allowing swaps of
multiple vertices at a time, the algorithm is able to avoid getting trapped at a local minimum,
and so it is able to get closer to obtaining a partition that will achieve the global minimum.
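
The swapping idea can be sketched as a simple local search. The code below is a simplification written for illustration and is not the full KL algorithm: it exchanges one pair of vertices at a time and stops at the first local minimum, whereas KL evaluates sequences of swaps. The example graph (two triangles joined by one edge) and the function names are ours.

```python
from itertools import product

def cut_size(edges, side_a):
    """Number of edges with exactly one endpoint in side_a."""
    return sum((u in side_a) != (v in side_a) for u, v in edges)

def swap_refine(edges, side_a, side_b):
    """Repeatedly apply the single best improving swap until none exists."""
    side_a, side_b = set(side_a), set(side_b)
    best = cut_size(edges, side_a)
    while True:
        candidate = None
        for a, b in product(sorted(side_a), sorted(side_b)):
            size = cut_size(edges, (side_a - {a}) | {b})
            if size < best:               # keep the strictly best swap seen so far
                best, candidate = size, (a, b)
        if candidate is None:
            return side_a, side_b, best   # local minimum reached
        a, b = candidate
        side_a = (side_a - {a}) | {b}
        side_b = (side_b - {b}) | {a}

edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(swap_refine(edges, {0, 1, 3}, {2, 4, 5}))  # ({0, 1, 2}, {3, 4, 5}, 1)
```

Each swap keeps the two classes the same size, so the balance requirement of the GPP is preserved throughout.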

Another useful algorithm is JOSTLE, a multilevel paradigm algorithm. JOSTLE and
other multilevel paradigm algorithms group the graph's vertices to make clusters that then
become vertices in a new graph. This can be done by contracting edges. The process is repeated
until a smaller graph is obtained, and then existing exact GPP algorithms are applied to the new
graph. By expanding and refining the partition of the smaller graph, this algorithm works
backwards to create a partition for the original graph (Banos, et al. 2003).

There are many other algorithms for solving the GPP, including an isoperimetric
algorithm  (Grady  and  Schwartz  2006),  a  lock-gain  based  algorithm  (Kim  and  Moon  2004), 
greedy  algorithms,  evolutionary  search  methods,  genetic  algorithms  (Bui  and  Moon  1996), 
simulated  annealing  algorithms  (Johnson,  Aragon,  et  al.  1989),  and  tabu  search  methods.  As  of 
2007,  JOSTLE  appears  to  be  the  best  performing  algorithm  for  the  GPP  (Loureiro  and  Amaral 
2007). 

Chris Walshaw, of the University of Greenwich, maintains “The Graph Partitioning
Archive”*. The archive consists of a set of benchmark problems, most of which are obtained
from real-world applications. Most of the recent publications compare algorithms based on the
instances provided there. Considering these instances and more from other sources, it appears
that there are some difficult cases of the GPP. Some of these instances as of 2005 were taking
over 4 hours to compute (Felner 2005).


3.3.2.2  Establishing  a  Protocol 

At  first  glance,  it  appears  to  be  a  simple  matter  to  create  a  zero-knowledge  proof  system  for  the 
general  version  of  the  GPP  (on  weighted  graphs).  There  are  three  things  that  must  be  proven  to 
the verifier: (1) the partition is valid (every vertex is in one and only one partition class), (2)
there are k partition classes of equal size (n/k vertices each), and (3) there are M edges between
the partition classes.
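
The three conditions can be checked directly when the partition is known. The sketch below is illustrative only: the function name `check_partition` is ours, the matrix is the weighted example matrix from Figure 17, and we assume its rows correspond to the vertices A, B, C, D in that order. Following the problem statement, the cut is accepted if it is at most M.

```python
def check_partition(adj, classes, k, max_cut):
    """Return (accepted, cut_weight) for a proposed partition of the vertices."""
    n = len(adj)
    members = [v for cls in classes for v in cls]
    valid = sorted(members) == list(range(n))          # (1) each vertex exactly once
    balanced = len(classes) == k and all(len(cls) * k == n for cls in classes)  # (2)
    label = {v: i for i, cls in enumerate(classes) for v in cls}
    cut = sum(adj[u][v]
              for u in range(n) for v in range(u + 1, n)
              if label.get(u) != label.get(v))         # (3) weight between classes
    return valid and balanced and cut <= max_cut, cut

adj = [[0, 0, 2, 1],
       [0, 0, 1, 2],
       [2, 1, 0, 0],
       [1, 2, 0, 0]]
print(check_partition(adj, [[0, 2], [1, 3]], k=2, max_cut=2))  # (True, 2)
print(check_partition(adj, [[0, 1], [2, 3]], k=2, max_cut=2))  # (False, 6)
```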

Considering these requirements, we arrive at Protocol A, illustrated in Figure 17.
Protocol  A  requires  the  prover  to  send  a  commitment  to  a  permuted  adjacency  matrix  for  the 
graph,  and  then  to  prove,  at  the  request  of  the  verifier,  either  that  the  permutation  was  performed 
correctly  or  that  there  exists  a  partition  of  the  vertices  that  obtains  the  required  cut  cost. 


* Available at: http://staffweb.cms.gre.ac.uk/~wc06/partition/

Common Input: A graph G (shown), the number of desired partitions (k = 2), and the cost of
the desired cut (c = 2).

Private Input: The partition {{A,C},{B,D}} that achieves the desired cut cost (as shown).

Prover:
1. Creates a permutation π of the vertices of G.
2. Creates an adjacency matrix A for π(G).
3. Sends a commitment to A to the verifier.

        0 0 2 1
A  =    0 0 1 2
        2 1 0 0
        1 2 0 0

Verifier:
4. Chooses a random bit c.
5. Sends c to the prover.

If c = 0:
6. Prover: Sends π and the decommitment information for A to the verifier.
7. Verifier: Checks that A was formed correctly.

If c = 1:
6. Prover: Sends the decommitment information for the cut edges to the verifier.
7. Prover: Sends the decommitment information for the non-edges between partition classes to
the verifier.
Verifier: Checks that the total sum of edges shown is equal to twice the desired cut cost.
Checks that the partition classes have equal size and that there is the correct number of classes
(there are 2 nodes in one class, there are 2 nodes in the other class, and the two classes are
distinct and disjoint).


Figure 17: Protocol A for the graph partitioning problem example


While  Protocol  A  satisfies  the  completeness  and  the  soundness  properties  that  are 
required  for  any  interactive  proof  system,  it  does  not  satisfy  the  zero-knowledge  property  and 
hence  is  not  able  to  qualify  as  a  zero-knowledge  proof  system.  In  steps  6  and  7,  the  prover  opens 
all  entries  that  correspond  to  edges  between  partition  classes.  This  then  tells  the  verifier  how 
many  edges  there  are  between  the  partition  classes,  and  what  the  different  weights  are  (but  not 
which vertices the edges are between). This is information that the verifier could not possibly
have determined alone without the help of the prover.

A modification to Protocol A is shown in Protocol B. Using the same protocol but
requiring that the common input is a graph with all edge weights equal to 1, the prover is not
revealing any new information. Because the cost of the cut is common knowledge, the verifier
already knows how many edges the graph has between partition classes (provided that all
edge weights are equal to 1). Protocol B is illustrated in Figure 18. The modification from
weighted to unweighted graph now allows the protocol to be a zero-knowledge proof system.


Common Input: An unweighted graph G (shown), the number of desired partitions (k = 2), and
the cost of the desired cut (c = 2).

Private Input: The partition that achieves the desired cut cost (as shown).

Prover:
1. Creates a permutation π of the vertices of G.
2. Creates an adjacency matrix A for π(G).
3. Sends a commitment to A to the verifier.

        0 0 1 1
A  =    0 0 1 1
        1 1 0 0
        1 1 0 0

Verifier:
4. Chooses a random bit c.
5. Sends c to the prover.

If c = 0:
6. Prover: Sends π and the decommitment information for A to the verifier.
7. Verifier: Checks that A was formed correctly.

If c = 1:
6. Prover: Sends the decommitment information for the cut edges to the verifier.
7. Prover: Sends the decommitment information for the non-edges between partition classes to
the verifier.
Verifier: Checks that the total sum of edges shown is equal to twice the desired cut cost.
Checks that the partition classes have equal size and that there is the correct number of classes
(there are 2 nodes in one class, there are 2 nodes in the other class, and the two classes are
distinct and disjoint).


Figure 18: Protocol B for the graph partitioning problem example
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Claim:  Protocol  B  is  a  zero-knowledge  proof  system  for  the  GPP. 

Proof: 


Completeness:  If  the  prover  has  a  yes-instance  x  of  GPP,  then  the  verifier  will  accept  x 
with  probability  1 . 


Soundness:  If  the  prover  has  a  no-instance  y  of  GPP,  the  prover  will  be  eaught  only 
when  the  verifier  chooses  c  =  1.  Since  c  is  chosen  uniformly  and  randomly  by  the 
verifier,  the  probability  that  the  verifier  will  rejeet  y  is  1/2  in  each  round.  This  implies 

that  the  probability  that  the  verifier  does  not  reject  y  after  c  rounds  is  at  most  =2“^. 

Zero-Knowledge Property: Suppose the verifier is attempting to extract useful
information from his conversation with the prover. Then the verifier can, in the same
manner, extract the information even without the aid of the prover. In each round he does
the following:

Begin.

Verifier simulates the prover. The verifier flips a fair coin and, according to the
outcome of the coin, commits to either the graph G or an arbitrary k-partition of n
vertices with the correct total cut cost. G is committed to in the same way the
prover would have done so. The partition is committed to in just the way the
prover would have committed to such a partition in G. Then, acting as prover, he
presents the commitment information to the verifier. Now he takes the other side.

Verifier simulates the verifier. The verifier guesses uniformly at random
whether to request a graph or a partition. Because the verifier has no way to
guess with any advantage whether the committed matrix contains a graph or a
partition (because the choice is random), there is a 50% chance that he requests an
option (graph or partition) that the verifier, in the guise of prover, can supply. If
not, the verifier backs up the simulation to the state it was in at the start of this
round and restarts the entire round (verifier simulating the prover).

End.

In an expected 2 passes through each round, the verifier will obtain the
information without the help of the prover. Thus the interaction does not help the verifier
do something with the prover in expected polynomial time that he could not as well have
done without the prover in expected polynomial time. ■
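
The rewinding argument in the proof can be illustrated numerically. The following Python sketch is only a toy model and not part of Protocol B: it assumes the simulated challenge is independent of the simulator's coin, and it counts how many passes a round needs before the two agree. The names `simulate_round` and `average_passes` are hypothetical.

```python
import random


def simulate_round(rng):
    """Return the number of passes needed to complete one simulated round.

    In each pass the simulator flips a coin to decide what it commits to
    (0 = the graph, 1 = a partition) and the simulated verifier picks a
    challenge bit. The pass succeeds when the two agree; otherwise the
    round is rewound and restarted from the beginning.
    """
    passes = 0
    while True:
        passes += 1
        prepared = rng.randint(0, 1)   # what the simulator is able to open
        challenge = rng.randint(0, 1)  # what the simulated verifier requests
        if prepared == challenge:
            return passes


def average_passes(rounds, seed=0):
    """Average number of passes per round over `rounds` simulated rounds."""
    rng = random.Random(seed)
    return sum(simulate_round(rng) for _ in range(rounds)) / rounds


if __name__ == "__main__":
    # Each pass succeeds with probability 1/2, so the count is geometric
    # and the average should be close to 2.
    print(average_passes(100_000))
```

Under these assumptions the number of passes per round is geometrically distributed with success probability 1/2, which is where the expected value of 2 comes from.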


While we have now proven that Protocol B is a zero-knowledge proof system, we must
also determine the level of difficulty of the GPP given the amount of information that is revealed
in the problem. The modification to the protocol from weighted to unweighted graphs does not
affect the NP-completeness of the problem, as discussed previously. However, many entries of
the adjacency matrix have been revealed and could possibly make it easy for an eavesdropper to
solve the problem instance.

Consider Protocol B, as illustrated in Figure 18. If too many entries must be revealed by
the prover, then the isomorphism may be discovered easily by the verifier using an effective
graph isomorphism algorithm. Let G be a graph with $|G| = n$, $\|G\| = m$, $|\mathcal{P}| = k$ and
$|P| = n/k$ for $P \in \mathcal{P}$, where $\mathcal{P}$ is the partition (as in the set of partition classes).
For the verifier to check that the prover's answer is valid, the verifier must see the adjacency
matrix entries for all edges and nonedges between partition classes. Thus the number of entries
that will be revealed to the verifier is:

$$n\left(n - \frac{n}{k}\right) = \frac{n^2 (k - 1)}{k} \qquad (25)$$

From this, we can see that the minimum possible number of entries that need to be
revealed is $n^2/2$, as is the case in the example above when k = 2, but that the maximum
possible number of entries can be as high as $n(n - 1) = n^2 - n$, in the case where k = n. It is
important to note that the number of entries revealed is entirely dependent on k when the graph
G is fixed, and also that the problem does not appear to increase in difficulty when k is
increased.
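
As a quick check of the two extreme cases quoted above, the count of revealed entries can be computed directly. This is a minimal sketch that assumes n is divisible by k and counts ordered pairs of vertices lying in different partition classes; the function name `revealed_entries` is hypothetical.

```python
def revealed_entries(n, k):
    """Number of adjacency-matrix entries between different partition classes.

    Assumes k classes of equal size n / k. Each of the n vertices has
    n - n/k vertices outside its own class, and every such ordered pair
    corresponds to one matrix entry.
    """
    if k < 1 or n % k != 0:
        raise ValueError("n must be a positive multiple of k")
    class_size = n // k
    return n * (n - class_size)


if __name__ == "__main__":
    print(revealed_entries(4, 2))   # n^2 / 2 for k = 2
    print(revealed_entries(6, 6))   # n(n - 1) for k = n
```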

In determining how useful Protocol B is, we must consider the number of bits to be
transferred in each round. Since the graphs we are considering in this example are simple,
undirected graphs, the adjacency matrices will be symmetric with zeros along the diagonal and
with all entries either 0 or 1. Thus the prover only needs to transmit $\binom{n}{2}$ entries of A to the
verifier. Hence step 3 requires the transmission of $\binom{n}{2}$ committed entries, each of which is one
bit. In step 5, the verifier sends one bit. If c = 0, the prover must send the isomorphism $\pi$. We
can send this in list form, and so we will need $n \log_2 n$ bits. If c = 1, the prover must send the
decommitment information as specified in the protocol.

Adding everything up and not including what is needed for commitment, the total number
of bits sent will be:

$$\binom{n}{2} + 1 + n \log_2 n \qquad (26)$$

If the maximum amount of information to be transmitted is 10 kilobits, then we must have:

$$\binom{n}{2} + 1 + n \log_2 n \le 10000 \qquad (27)$$

$$n \le 134 \qquad (28)$$

The  largest  graph  to  be  considered  could  have  at  most  134  vertices  under  the  given  restriction. 
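
The bound on n can be reproduced with a few lines of Python. This is a sketch under the same accounting as the text (one bit per committed entry, one challenge bit, and n log2 n bits for the isomorphism, with commitment overhead excluded); the function names are hypothetical.

```python
import math


def bits_per_round(n):
    """Bits sent in one round: committed entries + challenge + isomorphism."""
    return math.comb(n, 2) + 1 + n * math.log2(n)


def largest_graph(budget_bits):
    """Largest n whose per-round traffic fits within budget_bits."""
    if bits_per_round(1) > budget_bits:
        raise ValueError("budget is too small for any graph")
    n = 1
    # bits_per_round is increasing in n, so a linear scan is sufficient.
    while bits_per_round(n + 1) <= budget_bits:
        n += 1
    return n


if __name__ == "__main__":
    print(largest_graph(10_000))  # 134
```

For n = 134 the total is roughly 9859 bits, while n = 135 needs slightly more than 10000 bits, so the limit is tight.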
3.3.3 Minimum Label Spanning Tree

The minimum label spanning tree problem (MLSTP) is stated as follows: Given a graph G with labeled
edges, find a spanning tree that uses the fewest number of labels possible. In other
words, given a graph $G = (V, E, l)$, with vertex set V, edge set E, and edge label set l, find an
acyclic connected spanning sub-graph $T \subseteq G$ such that $|L_T|$ is minimized, where
$L_T = \{c \in l : \exists e \in E(T) \text{ with } l(e) = c\}$. In the example graph in Figure 19,
$V = \{a, b, c, d\}$, $E = \{uv : u, v \in V\}$, and $l = \{1, 2\}$. There are many spanning trees to
consider in the graph shown. It is clear that to include vertex d in the spanning tree, at least one
edge with label 2 must be included. Thus a minimum labeling spanning tree is $T = (V_T, E_T, l_T)$,
where $V_T = \{a, b, c, d\}$, $E_T = \{da, db, dc\}$, and $l_T = \{2\}$. This tree is shown in red in the
figure. We should note that in the example illustrated it is clear where the spanning tree lies in
the original graph, as only one vertex has three incident edges with the same label. In order to
make the private input as safe as possible, it is important to distribute the edge labels as evenly
as possible.

The  minimum  label  spanning  tree  problem  is  an  NP-complete  problem  when  we  rephrase 
it  as  a  decision  problem.  In  fact,  it  has  been  proven  that  no  polynomial-time  approximation 
algorithm  with  a  constant  approximation  ratio  can  exist  unless  P  =  NP.  The  MLSTP  has  many 
real-world applications, such as communications networks. These kinds of networks can use
several different types of communications media, such as cable, telephone lines, etc. Solving
the MLSTP can give a spanning network using as few different media as possible (Chang and
Leu 1997).


Figure 19: An example of the minimum label
spanning tree problem

3.3.3.1 Algorithms

There are several popular algorithms for solving the MLSTP. The most popular algorithm until
2005 was MVCA, the maximum vertex covering algorithm (Consoli, The Development and
Application of Metaheuristics for Problems in Graph Theory: A Computational Study 2008).
The MVCA was introduced in the paper that first described the MLSTP (Chang and Leu 1997).
This approximation algorithm produces a solution that is no greater than $(1 + 2 \log n)$ times the
optimal. It has also been proven that for any graph with label frequency bounded by some value
$b$, the worst-case bound of MVCA is $H_b = \sum_{i=1}^{b} \frac{1}{i}$, the $b$-th harmonic number (Xiong, Golden and
Wasil, Worst-Case Behavior of the MVCA Heuristic for the Minimum Labeling Spanning Tree
Problem 2005).
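
To make the greedy idea behind MVCA concrete, the following Python sketch repeatedly adds the label whose edges leave the fewest connected components, stopping once the chosen labels connect the graph. It is an illustration of the heuristic as it is commonly described, not a reproduction of the published algorithm: the tie-breaking rule, the function names, and the example instance (the labels of edges ab, ac, bc are assumed to be 1) are assumptions.

```python
def count_components(num_vertices, edges):
    """Count connected components of a graph on vertices 0..num_vertices-1."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = num_vertices
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return components


def mvca(num_vertices, labeled_edges):
    """Greedy label selection; labeled_edges is a list of (u, v, label)."""
    all_labels = {label for _, _, label in labeled_edges}
    chosen = set()
    current = num_vertices
    while current > 1:
        best_label, best_count = None, current
        for label in sorted(all_labels - chosen):  # ties go to the smallest label
            edges = [(u, v) for u, v, lab in labeled_edges
                     if lab in chosen or lab == label]
            count = count_components(num_vertices, edges)
            if count < best_count:
                best_label, best_count = label, count
        if best_label is None:
            raise ValueError("the input graph is not connected")
        chosen.add(best_label)
        current = best_count
    return chosen


if __name__ == "__main__":
    a, b, c, d = 0, 1, 2, 3
    example = [(a, b, 1), (a, c, 1), (b, c, 1), (d, a, 2), (d, b, 2), (d, c, 2)]
    print(mvca(4, example))  # {2}
```

Any spanning tree of the sub-graph formed by the chosen labels is then returned as the heuristic solution.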

Another algorithm that appears frequently in the literature is a metaheuristic algorithm
called the Pilot Method. The Pilot Method improves upon another heuristic algorithm (such as
MVCA) using repetition and a look-ahead strategy (Voß and Duin 2003). While the Pilot
Method will perform at least as well as the heuristic algorithm that it implements (if not better), it
is often quite time consuming because of its repetitive nature.

Other algorithms for the MLSTP include genetic algorithms (Xiong, Golden and Wasil,
A One-Parameter Genetic Algorithm for the Minimum Labeling Spanning Tree Problem 2005),
tabu search algorithms, and a more recent hybrid algorithm. It appears that the best performing
algorithms are VNS (Variable Neighborhood Search) and GRASP (Greedy Randomized
Adaptive Search Procedure), which were introduced in 2009 (Consoli, Darby-Dowman, et al.
2009). There is also a set of benchmark instances maintained by Sergio Consoli (available at
http://www.sergioconsoli.com/MLSTP.htm).


3.3.3.2 Creating a Zero-Knowledge Proof System

Consider the interactive proof system that is illustrated in Figure 20. While the protocol satisfies
the completeness and soundness properties of an interactive proof system, it does not satisfy the
zero-knowledge property. In the prover's final step, the edges corresponding to the spanning tree
are revealed. If $|L_T| > 1$, then the verifier learns how many edges have the same labels. While
the verifier does not know which group of edges corresponds to which label, the prover is still
transmitting information that the verifier could not have discovered using a simulator.

The next logical question to consider is whether we can restrict the spanning tree so that
$|L_T| = 1$ in order to satisfy the zero-knowledge property. Since all spanning trees of G have
$n - 1$ edges, the verifier would then already know how many edges have the same label. However,
the problem then becomes too easy to base a secure protocol on. For example, consider the basic
algorithm illustrated in Figure 21. If we use any efficient algorithm for finding a spanning tree
(MinSpanTree), most of which run in polynomial time, then the problem is easily solvable in an
efficient manner. Thus in order to use the MLSTP, we must first develop a better
zero-knowledge proof system.


Common Input: A labeled graph G (shown in Figure 19), and the number of distinct labels in a min. label
spanning tree ($|L_T| = 1$ in the example).

Private Input: The min. label spanning tree, T (shown in Figure 19 in red).

Prover:
1. Creates a permutation $\pi$ of V(G).
2. Chooses a permutation $\sigma$ of the set of labels.
3. Creates an adjacency matrix A for $\pi\sigma(G)$.
4. Sends a commitment to A to the verifier.

        0 1 1 1
    A = 1 0 2 2      (commitment)
        1 2 0 2
        1 2 2 0

Verifier:
5. Chooses a random bit c.
6. Sends c to the prover.

If c = 0:
Prover:   7. Sends $\pi$ and the decommitment information for A to the verifier.
Verifier: 8. Checks that A was formed correctly.

If c = 1:
Prover:   7. Sends the decommitment information for the entries of A
             corresponding to edges in T.
Verifier: 8. Checks that the entries correspond to a spanning tree using $|L_T|$ labels.

Figure  20:  An  interactive  proof  system  for  the  minimum  label  spanning  tree  problem  example 


for i ∈ l(G)
    E_i = {e ∈ E(G) : l(e) = i}
    G_i = (V(G), E_i)

    if G_i is connected
        return MinSpanTree(G_i)
        exit // Exit both loops
    end if
end for


Figure  21:  An  algorithm  for  the  minimum  label 
spanning  tree  problem  with  one  label 
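
The algorithm of Figure 21 can be written as a short runnable sketch. The Python below is one possible rendering and makes a few assumptions not stated in the figure: the graph is given as a list of (u, v, label) triples, a breadth-first search stands in for MinSpanTree (any spanning tree routine would do, since the graph is unweighted), and the function names are hypothetical.

```python
from collections import defaultdict, deque


def bfs_spanning_tree(vertices, edges):
    """Return a spanning tree of (vertices, edges) as an edge list, or None
    if the graph is not connected."""
    vertices = list(vertices)
    if not vertices:
        return []
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    start = vertices[0]
    seen = {start}
    tree = []
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                tree.append((u, v))
                queue.append(v)
    return tree if len(seen) == len(set(vertices)) else None


def one_label_spanning_tree(vertices, labeled_edges):
    """Try each label in turn; return (label, tree) for the first label whose
    edges alone connect the graph, or None if no single label suffices."""
    by_label = defaultdict(list)
    for u, v, label in labeled_edges:
        by_label[label].append((u, v))
    for label, edges in by_label.items():
        tree = bfs_spanning_tree(vertices, edges)
        if tree is not None:
            return label, tree
    return None


if __name__ == "__main__":
    # Instance modeled on Figure 19; the labels of ab, ac, bc are assumed to be 1.
    edges = [("a", "b", 1), ("a", "c", 1), ("b", "c", 1),
             ("d", "a", 2), ("d", "b", 2), ("d", "c", 2)]
    print(one_label_spanning_tree("abcd", edges))
```

The loop runs once per label and each connectivity test is linear in the size of the graph, which is why the single-label case is too easy to support a secure protocol.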


Any zero-knowledge proof system for the MLSTP needs to check the following facts:
(1) $|L_T|$ labels are used on T

(2) T is acyclic

(3) T is connected

(4) T is spanning

It is possible to check (3) and (4) simultaneously by having the verifier request two
vertices and requiring the prover to show a path in T between those two vertices. However this
will give the verifier information on the spanning tree that could not have been obtained without
the help of the prover. The verifier can check (2) by requesting that the prover show that
$\|T\| = n - 1$, where n is the number of vertices in G (a connected spanning sub-graph with
exactly n - 1 edges is necessarily acyclic). Again, we arrive at the problem of
determining how the prover can reveal the number of edges in T without giving away any of the
structure of the tree. Lastly, the problem of proving (1) is going to be the most difficult in terms
of preserving the zero-knowledge property in the proof system. It will require a more creative
approach to construct a zero-knowledge proof system for the MLSTP than what we have
considered so far.
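
For reference, the four facts are easy to check when T is given in the clear. The Python sketch below does exactly that and is therefore not zero-knowledge; it only pins down what a protocol would have to establish without revealing T. It assumes vertices are numbered 0 to n - 1 and that T is given as (u, v, label) triples; the function name is hypothetical.

```python
def is_label_spanning_tree(num_vertices, tree_edges, num_labels):
    """Check facts (1)-(4) for a candidate tree given as (u, v, label) triples."""
    # (1) exactly num_labels distinct labels are used on T
    labels = {label for _, _, label in tree_edges}
    if len(labels) != num_labels:
        return False
    # A spanning tree on n vertices has exactly n - 1 edges.
    if len(tree_edges) != num_vertices - 1:
        return False
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, _ in tree_edges:
        if not (0 <= u < num_vertices and 0 <= v < num_vertices):
            return False
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # (2) this edge would close a cycle
        parent[root_u] = root_v
    # n - 1 edges with no cycle on n vertices: (3) connected and (4) spanning
    return True


if __name__ == "__main__":
    star = [(3, 0, 2), (3, 1, 2), (3, 2, 2)]
    print(is_label_spanning_tree(4, star, 1))  # True
```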

3.3.3.3  Coping  with  Weighted  Graphs 

So far our work on zero-knowledge proof systems has dealt with only unweighted graphs, i.e. all
edge weights are either 0 or 1, corresponding to nonedges and edges respectively. When edge
weights are introduced into an interactive proof system, usually the completeness and soundness
properties are preserved but the zero-knowledge property is not. When the prover reveals
information in the permuted and committed adjacency matrix for the graph, the prover is not
only revealing that edges exist but also the weights of the edges. This allows the verifier to
discover information about the graph that could not possibly have been computed using a
simulator. So far, it does not appear that this issue has been addressed in the literature.

When we consider the decision version of the minimum label spanning tree problem, also
known as the bounded label spanning tree problem (BLSTP), the original proof of the
NP-completeness of the problem is based on proving that if MLSTP is easily solved then the
minimum set covering problem is easily solved (Chang and Leu 1997). Unfortunately, as there
is no clear way to convert an arbitrary instance of MLSTP into another known NP-complete
problem, we are left with no obvious way of transforming an existing zero-knowledge proof
system for the class NP to this problem, as is suggested in the proofs that all languages in NP
have zero-knowledge protocols (Goldreich, Micali and Wigderson 1991).

There are several possible options for creating zero-knowledge proof systems for
weighted graph problems; however, none of these options has been especially fruitful. While
some of the options have worked in specific cases, no option has worked in every case and there
still remain problems in which no option is feasible (MLSTP). The options that have been
considered already are the following:

1. Convert the base problem on weighted graphs to a base problem on unweighted
graphs by changing all edge weights that are greater than one to edge weight one.

In some problems, such as the minimum label spanning tree problem, this option
can make the base problem much easier. This can enable a cheater to break the
problem instance and impersonate a trusted party. However, this solution seems
feasible for the graph partitioning problem.

2.  Convert  the  problem  instances  so  that  the  solutions  use  the  same  number  of  edges  of 
each  edge  weight  involved  and  include  this  number  in  the  common  input. 

This  is  not  always  a  realistic  possibility.  It  can  become  quite  cumbersome  to 
create  problem  instances  in  which  the  solutions  are  uniform,  and  it  can  also  make  the 
problem  instances  much  easier  for  cheaters  to  break  and  solve.  However,  this  option 
appears  to  work  well  for  the  graph  coloring  problem  (considering  vertex  weights 
instead  of  edge  weights). 

3.  Use  the  reduction  from  an  existing  NP-complete  problem  to  the  base  problem  (as  is 
done  in  a  standard  proof  of  NP-completeness)  to  transform  the  problem  into  one  that 
is  usable  in  an  existing  zero-knowledge  proof  system. 

Proofs of NP-completeness show two facts. The first is that the base problem is
in the class NP. The second fact is that the problem is at least as hard as an existing
NP-complete problem, i.e. an instance of the existing NP-complete problem is true if and
only if a corresponding instance of the base problem is true. This is most commonly
accomplished by transforming an instance of the existing problem into some
corresponding instance of the base problem. Because the transformation runs in this
direction only, it leaves us with no way to
transform any instance of the base problem into an instance of the existing problem,
and hence no way to apply a zero-knowledge proof system for the existing problem
to the base problem. However, this approach works well for converting the traveling
salesman problem to a sub-graph isomorphism problem (by adding k vertices along
edge e with $l(e) = k + 1$ and then searching the new unweighted graph for a cycle
of length equal to the length of a minimum TSP tour in the original weighted graph).
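
The edge subdivision mentioned at the end of option 3 can be sketched as follows. This Python fragment assumes positive integer edge weights and replaces each edge of weight k + 1 by a path with k new intermediate vertices, so that path lengths in the unweighted graph equal weights in the original. The function name and the naming scheme for the new vertices are hypothetical.

```python
def subdivide(weighted_edges):
    """Turn a weighted edge list [(u, v, w), ...] into an unweighted one.

    An edge of weight w becomes a path of w unit edges, which introduces
    w - 1 new vertices named ("sub", edge_index, step).
    """
    unweighted = []
    for index, (u, v, w) in enumerate(weighted_edges):
        if not isinstance(w, int) or w < 1:
            raise ValueError("edge weights must be positive integers")
        previous = u
        for step in range(1, w):
            new_vertex = ("sub", index, step)
            unweighted.append((previous, new_vertex))
            previous = new_vertex
        unweighted.append((previous, v))
    return unweighted


if __name__ == "__main__":
    triangle = [("a", "b", 1), ("b", "c", 2), ("c", "a", 3)]
    print(len(subdivide(triangle)))  # 6 unit edges, the total weight of the triangle
```

The number of vertices and edges grows with the total edge weight, which is the source of the extra transmission cost discussed for this option.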

Weighted  graphs  appear  to  greatly  complicate  the  zero-knowledge  proof  systems.  The 
three  options  discussed  above  clearly  are  not  perfect  solutions,  but  they  do  seem  to  work  for 
some  particular  problems.  It  is  worth  considering  whether  the  added  complication  is  worth  the 
trouble.  Either  the  base  problem  can  be  converted  to  an  unweighted  graph  by  adding  vertices 
and  edges  (which  increases  the  number  of  bits  sent  between  prover  and  verifier)  or  the  prover  is 
required to send a commitment to an adjacency matrix that is no longer filled with only 0's and
1's (which again increases the number of bits sent). In the first case, the amount of information
that needs to be transferred increases, while the problem instances themselves may not be more
difficult than instances of a similar base problem on unweighted graphs. In the second case, it
becomes much more difficult to satisfy the zero-knowledge property.

4. RESULTS AND DISCUSSION


While it may be one of the best known NP-complete problems, the satisfiability problem is not a
practical base problem for a zero-knowledge proof system. First, the amount of information that
is required to be computed and transferred in the existing protocol is very large compared to the
protocols that exist for the other problems discussed in this report. Second, many efficient
solvers  exist  for  the  problem.  For  example,  the  solvers  tested  during  the  SAT  competition  are 
able  to  solve  instances  with  millions  of  variables  and  millions  of  clauses.  This  fact  coupled  with 
the  data  transfer  in  the  zero-knowledge  proof  system  discussed  makes  the  problem  very 
impractical  for  implementation.  Lastly,  as  of  yet  there  does  not  exist  a  method  for  generating 
hard  instances  of  the  satisfiability  problem.  Many  instances  that  are  known  to  be  difficult  were 
found  by  a  guess-and-check  process,  which  will  not  be  practical  for  use  in  a  secure  protocol.  We 
must  be  able  to  create  hard  instances  of  whatever  base  problem  is  selected. 

Graph  coloring  and  equitable  coloring  are  one  step  closer  to  being  practical  base 
problems  for  zero-knowledge  proof  systems  than  the  satisfiability  problem.  While  the 
probability  of  catching  a  cheating  prover  may  not  be  as  high  in  the  protocol  for  equitable  3- 
coloring  as  in  some  of  the  protocols  using  other  base  problems,  we  are  at  least  aware  of  methods 
for creating difficult problem instances. A difficult problem instance is one that existing
algorithms are unable to solve optimally in a reasonable amount of time. The set of graphs
introduced  by  The  Second  DIMACS  Implementation  Challenge  (1992-1993)  seems  to  contain 
some  difficult  classes  of  graph  coloring  instances.  These  difficult  instances  would  enable  the 
graph  coloring  problem  to  be  a  good  base  problem  for  a  zero-knowledge  proof  system,  but  a 
stronger  zero-knowledge  proof  system  in  which  a  cheater  is  more  easily  discovered  must  be 
developed. 

Out of the problem classes discussed in this report, the sub-graph isomorphism class
appears to be the most promising. In particular, the longest path problem and the sub-graph
isomorphism problem itself seem to have the most potential. Currently there do not exist any
extremely efficient solvers for the longest path problem, and all sub-graph isomorphism class
problems have a zero-knowledge proof system with a probability of 1/2 of catching a cheating
prover in each round, taking only 7 rounds to achieve a confidence level of 99%. The protocols
are also efficient compared to the existing protocols for other classes of base problems in terms
of the amount of data transferred between prover and verifier. Overall, the problems in the sub-graph
isomorphism class, with the exception of graph isomorphism, seem to have the most potential.
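
The round count quoted above can be checked with a short calculation, assuming that each round independently catches a cheating prover with probability 1/2, so that a cheater survives r rounds with probability at most 2^{-r}:

```latex
\Pr[\text{cheater survives } r \text{ rounds}] \le 2^{-r}, \qquad
2^{-6} = \tfrac{1}{64} \approx 0.0156 > 0.01, \qquad
2^{-7} = \tfrac{1}{128} \approx 0.0078 < 0.01 .
```

Under that assumption, r = 7 is the smallest number of rounds for which the verifier's confidence $1 - 2^{-r}$ reaches 99%.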

While  the  sub-graph  isomorphism  class  is  emerging  as  a  useful  set  of  base  problems  for 
zero-knowledge  proof  systems,  there  is  still  work  to  be  done.  More  testing  needs  to  be  done  on 
the  efficiency  of  the  algorithms  for  the  sub-graph  isomorphism  problem  and  its  subproblems  in 
order  to  determine  the  lower  bound  on  the  size  of  the  problem  instance  for  a  difficult  problem. 
We  must  also  determine  which  graph  structures  are  capable  of  producing  the  hardest  instances  in 
the sub-graph isomorphism class. Is the average instance of the longest path problem harder than
the average case of the minimum bandwidth problem? Which of the problems can we develop
difficult instances for in a consistent manner?

Last, but not least, we must consider the latest problems in this area. For example, the
minimum label spanning tree, introduced in 1997, could be a promising base problem for a
zero-knowledge proof system. However, to be able to utilize this difficult problem, we must first
create a valid interactive proof system for the problem that satisfies the zero-knowledge property.
The creation of a zero-knowledge proof system involving weighted graphs will allow us to
consider many more graph theoretic problems that are currently unusable as base problems.

5. CONCLUSIONS AND FUTURE WORK


Zero-knowledge proof systems have many characteristics that are desirable for determining
trustworthy parties in an airborne networking environment. One approach is to base
zero-knowledge proof systems on the instances and solutions of NP-complete problems. This report
has investigated this approach with a focus on the graph theory problems within the NP-complete
and NP-hard classes.

Future research in this area must focus on application-driven requirements associated with
airborne mobile ad hoc networks. Protocols used for authentication of user identity, and establishment
of mutual trust, cannot constrain either the movement of information or the movement of systems
anywhere in the battlespace. Successful implementation of ZKP-based authentication protocols will
require that there be a positive impact on both network connectivity and network-user operations. The
efficiency and effectiveness of I/A protocols therefore need to be considered against realistic scenarios.
MANETs by definition are not static; their configurations change over time. Network connections and
information routing paths change when nodes are added to, or removed from, the network as new user
groups form or nodes are compromised. Mitigating factors such as time-sensitivity of the I/A process,
communication channel bandwidth and quality, network dynamics and data flows, user security access
requirements, and so on, all need to be accounted for when gauging protocol viability.

7. LIST OF SYMBOLS AND ABBREVIATIONS


Symbols

$\wedge$           and
$\vee$             or
$\approx$          is isomorphic to
$[k]$              the integers 1 through k
$\neg a$           for Boolean variable a, the complement of a
$S \setminus T$    for sets S and T, $\{s \in S : s \notin T\}$
$E(G)$             the set of edges of graph G
$G = (V, E)$       a graph with vertex set V and edge set E
$\overline{G}$     the complement of graph G
$|G|$              the number of vertices in graph G
$\|G\|$            the number of edges in graph G
$G(n, p)$          the Erdős-Rényi model random graph on n vertices with edge probability p
$G[S]$             the sub-graph of G induced by the set $S \subseteq E(G)$
$P_k$              a path with k edges
$P_k^n$            a path $P_k$ with additional edges added between every pair of vertices x, y such
                   that the distance between x and y in the path $P_k$ is at most n
$V(G)$             the set of vertices of graph G
$\Delta(G)$        the maximum degree of the graph G
$\pi\sigma$        for permutations $\pi$ and $\sigma$, equivalent to $\sigma \circ \pi$
$\chi(G)$          the chromatic number of the graph G

Abbreviations

3-SAT     satisfiability problem consisting of clauses with three variables
AN        airborne network
APX       the class of optimization problems with polynomial-time approximation
          algorithms with approximation ratio bounded by a constant
BLSTP     bounded label spanning tree problem
DIMACS    Center for Discrete Mathematics and Theoretical Computer Science
DLS       dynamic local search
E3C       equitable 3-coloring problem
G3C       graph 3-coloring problem
GA        genetic algorithm
GCP       graph clustering problem
GIP       graph isomorphism problem
GNI       graph non-isomorphism problem
GPP       graph partitioning problem
GRASP     greedy randomized adaptive search procedure
HCP       Hamiltonian cycle problem
ISP       independent set problem
KIS       the k-independent set problem
LPP       longest path problem
MANET     mobile ad hoc network
MBP       minimum bandwidth problem
MCP       maximum clique problem
MLSTP     minimum label spanning tree problem
MVCA      maximum vertex covering algorithm of Chang and Leu (1997)
NP        the class of nondeterministic polynomial problems
P         the class of deterministic polynomial problems
QRA       quadratic residuosity assumption
RLS       reactive local search
SA        simulated annealing
SAT       satisfiability problem
SGI       sub-graph isomorphism problem
TSP       traveling salesman problem
VNS       variable neighborhood search
VSS       variable space search
ZKP       zero-knowledge proof system

NP-Complete Graph Problems

General Papers on ZKP Background

Graph Isomorphism


6. G. Brassard, C. Crepeau. “Non-Transitive Transfer of Confidence: A Perfect
Zero-Knowledge Interactive Protocol for SAT and Beyond.” Proc. of the 27th Annual
Symp. on Foundations of Computer Science: 188-195, 1986.

Notes: Introduces an idea for a ZKP for graph isomorphism based on the assumption
that arbitrarily hard instances of the problem exist. States that the protocol will be
formalized in a later paper.

7. D. Conte, P. Foggia, C. Sansone, and M. Vento. “Thirty Years of Graph Matching in
Pattern Recognition.” International Journal of Pattern Recognition and Artificial
Intelligence (World Scientific Publishing Company) 18, no. 3 (2004): 265-298.

Notes: Discusses many of the graph and sub-graph isomorphism algorithms that
existed at the time of publication. Also shows many applications of the problems and
algorithms.

8. L. P. Cordella, P. Foggia, C. Sansone, M. Vento. “A (Sub)Graph Isomorphism
Algorithm for Matching Large Graphs.” IEEE Transactions on Pattern Analysis and
Machine Intelligence, 26(10): 1367-1372. Oct. 2004.

Notes: Introduces and describes the VF2 algorithm. Compares VF2 with Nauty and
Ullmann’s algorithm on the graph isomorphism problem with input graphs that are
randomly connected, 2D mesh, or bounded valence graphs.

9. P. Foggia, C. Sansone, M. Vento. “A Performance Comparison of Five Algorithms
for Graph Isomorphism.” Proc. of the 3rd IAPR TC-15 Workshop on Graph Based
Representations in Pattern Recognition: 188-199, 2001.

Notes: Compares VF2, Nauty, and Ullmann’s algorithm on benchmark sets of graphs
(tested on randomly connected, 2D mesh, and bounded valence graphs). Contains
many graphs and plots of the results.

10. P. Foggia. The VFLib Graph Matching Library, version 2.0. March 2001. Available
at http://amalfi.dis.unina.it/graph/db/vlib.html (accessed May 24, 2010).

Notes: The home of the VF2 algorithm. The C++ code is publicly available at this
site.

11. S. Fortin. “The Graph Isomorphism Problem.” Technical Report TR 96-20:
University of Alberta, July 1996.

Notes: A description of the graph isomorphism problem with a description of some
invariants under isomorphism that can be used to reduce the search space. Also
discusses Nauty and tests the program with a few specific types of graphs.

12. O. Goldreich. Foundations of Cryptography: Fragments of a Book. Weizmann
Institute of Science: 1995.

Notes: Presents a perfect zero-knowledge proof for the graph isomorphism problem.
Goes through a formal and thorough proof that the protocol presented is a
zero-knowledge proof system using simulators.

problem.  Contains  a  thorough  discussion  and  proof  that  the  protocol  is  a  perfect  zero-
knowledge  proof  system.  Discusses  a  modification  to  the  protocol  to  enable  parallel
execution  instead  of  sequential.

on  Graphs,  Groups,  or  Rings.”  CoRR:  2008.

Notes:  Introduces  and  discusses  the  variable  neighborhood  search  (VNS)  algorithm

zero-knowledge  proof  system  for  3-colorability.

Notes:  Outlines  a  noninteractive  zero-knowledge  proof  of  3-colorability  based  on  the

Tabu  search.  Tests  and  compares  the  new  algorithm  on  DIMACS  benchmarks
against  XRLF,  EDM,  and  Fleurent  and  Ferland’s  algorithm.  While  the  algorithm  is

the  performance  of  the  algorithm.

Notes:  Introduces,  describes,  and  discusses  Amacol.  Compares  and  tests  Amacol
against  Tabucol,  GH,  DSATUR,  Long  TABU,  and  Short  TABU  on  the  DIMACS
graphs.  Concludes  that  Amacol  is  competitive  with  the  existing  algorithms.

proof  that  the  protocol  is  in  fact  a  zero-knowledge  interactive  proof,  as  well  as  a
discussion  of  how  to  construct  constant-round  zero-knowledge  proof  systems  for  the

Sequential  Coloring.  Compares  the  runtimes  and  quality  of  solution  for  these
algorithms  by  testing  on  random  graphs  on  60  vertices  with  different  edge-densities.

considered.

Notes:  Introduces  and  defines  the  concept  of  zero-knowledge  proof  systems.

edge  at  random.  Discusses  why  this  is  a  zero-knowledge  proof  system.

Tabu  search,  dynamic  local  search,  and  iterative  local  search  (and  others).  Discusses

Notes:  Looks  at  the  graph  edge-coloring  problem  on  random  digraphs  by  its
inversion  problem:  color  the  edges  of  the  graph  so  as  to  obtain  the  specified  coloring.

efficiency/complexity  of  the  protocol.

set  problem.  Outlines  and  proves  zero  knowledge  protocols  for  both  of  these

the  independent  set  problem.

Notes:  Describes,  outlines,  and  discusses  the  Tabaris  algorithm  for  finding  a

2008.

clique  problem.

115.  E.  Marchiori.  “A  Simple  Heuristic  Based  Genetic  Algorithm  for  the  Maximum
Clique  Problem.”  Proc.  of  the  1998  ACM  Symp.  on  Applied  Computing:  366-373, 
1998. 

Notes:  Introduces  the  HGA  algorithm  and  tests  it  on  the  DIMACS  benchmark  graphs 
for  the  maximum  clique  problem  against  tabu  search  algorithms  and  the  genetic 
algorithm  GMCA.  Concludes  that  HGA  is  competitive  with  the  algorithms  tested. 

116.  P.R.J.  Ostergard.  “A  Fast  Algorithm  for  the  Maximum  Clique  Problem.”  Discrete 
Applied  Mathematics,  120:  197-207,  2002.

Notes:  Introduces  a  branch-and-bound  algorithm  using  a  vertex  order  from  a  coloring 
as  well  as  pruning  strategies.  Tests  the  algorithm  on  some  of  the  DIMACS 
benchmark  graphs  for  the  maximum  clique  problem  and  also  on  random  graphs. 

117.  W.  Pullan,  H.H.  Hoos.  “Dynamic  Local  Search  for  the  Maximum  Clique  Problem.” 
Journal  of  Artificial  Intelligence  Research,  25:  159-185,  2006. 

Notes:  Introduces  DLS-MC  (stochastic  local  search  algorithm).  Describes  the  five 
current  best  heuristic  algorithms.  Contains  results  on  testing  DLS-MC  on  all  80 
DIMACS  instances  for  the  maximum  clique  problem.  Compares  DLS-MC  with
DAGS,  GRASP,  k-opt,  RLS,  GENE,  ITER,  and  QUALEX-MS. 

118.  F.  Rossi,  S.  Smriglio.  “A  Branch-and-Cut  Algorithm  for  the  Maximum  Cardinality 
Stable  Set  Problem.”  Operations  Research  Letters,  28:  63-74,  2001. 

Notes:  Introduces  a  new  branch-and-cut  algorithm  for  the  independent  set  problem. 
Tests  the  algorithm  on  DIMACS  benchmark  graphs  for  the  maximum  clique  problem 
and  compares  it  with  other  branch-and-bound  algorithms. 

119.  E.  Tomita,  T.  Kameda.  “An  Efficient  Branch-and-Bound  Algorithm  for  Finding  a 
Maximum  Clique  with  Computational  Experiments.”  Journal  of  Global 
Optimization,  37:  95-111,  2007.

Notes:  Introduces  MCR  algorithm,  which  uses  approximate  coloring  and  sorting  of 
the  vertices.  Tests  the  algorithm  on  random  graphs  up  to  15,000  nodes  and  DIMACS 
benchmark  graphs  for  the  maximum  clique  problem  against  dfmax,  New,  and
COCR(COC)  algorithms. 

120.  E.  Tomita,  T.  Seki.  “An  Efficient  Branch-and-Bound  Algorithm  for  Finding  a
Maximum  Clique.”  Lecture  Notes  in  Computer  Science,  2731:  278-289,  2003. 

Notes:  Introduces  the  algorithm  MCQ,  which  is  improved  upon  later  by  the  algorithm
MCR.  Contains  testing  done  on  DIMACS  graphs  for  the  maximum  clique  problem
against  dfmax,  New,  and  COCR  algorithms. 

121.  Q.  Zhang,  J.  Sun,  E.  Tsang.  “An  Evolutionary  Algorithm  with  Guided  Mutation  for 
the  Maximum  Clique  Problem.”  IEEE  Transactions  on  Evolutionary  Computation, 
9(2):  192-200,  April  2005. 

Notes:  Introduces  the  EA/G  algorithm.  Tests  EA/G  on  the  DIMACS  benchmark
graphs  for  the  maximum  clique  problem  against  HGA  and  MIMIC.  Concludes  that
EA/G  is  competitive  with  the  other  algorithms  considered.

Satisfiability 


122.  “Satisfiability  Testing  or  How  to  Solve  Sudoku  Puzzles  -  The  DPLL  Method.”  From
the  International  Center  for  Computational  Logic.  Available  at:
http://www.computational-logic.org/iccl/master/lectures/summer07/sat/slides/dpll.pdf

Notes:  Gives  a  description  of  the  satisfiability  problem  and  an  overview  of  the  DPLL
method  for  solving  instances  of  the  satisfiability  problem.

123.  G.  Brassard,  C.  Crepeau.  “Non-Transitive  Transfer  of  Confidence:  A  Perfect  Zero-
Knowledge  Interactive  Protocol  for  SAT  and  Beyond.”  Proc.  of  the  27th  Annual
Symp.  on  Foundations  of  Computer  Science:  188-195,  1986.

Notes:  Outlines  a  basic  zero-knowledge  proof  system  for  the  satisfiability  problem.
Focuses  on  the  commitment  scheme  used.

124.  G.  Brassard,  D.  Chaum,  C.  Crepeau.  “Minimum  Disclosure  Proofs.”  Journal  of 
Computer  and  System  Sciences,  37(2):  156-189,  Oct.  1988. 

Notes:  Describes  and  illustrates  a  zero-knowledge  proof  system  for  the  satisfiability 
problem.  Discusses  how  the  protocol  satisfies  the  properties  necessary  for  a  zero-
knowledge  protocol.

125.  S.  Cook,  D.  G.  Mitchell.  “Finding  Hard  Instances  of  the  Satisfiability  Problem:  A
Survey.”  DIMACS  Series  in  Discrete  Mathematics  and  Theoretical  Computer
Science,  35:  1-17,  1997.

Notes:  Details  the  algorithms  DPLL,  GSAT,  and  WalkSAT.  Also  does  some  testing 
and  analysis  of  the  algorithms.  Has  some  discussion  on  the  construction  of  hard 
satisfiability  instances. 

126.  I.  Damgard.  “Non-Interactive  Circuit  Based  Proofs  and  Non-Interactive  Perfect  Zero-
Knowledge  with  Preprocessing.”  Proc.  of  Eurocrypt:  341-355,  1992.

Notes:  Thoroughly  outlines  a  noninteractive  proof  system  for  the  satisfiability 
problem  and  proves  that  it  satisfies  the  necessary  properties.  Proves  that  the  proof 
system  is  zero-knowledge  under  the  QRA. 

127.  C.  Dwork,  U.  Feige,  J.  Kilian,  M.  Naor,  M.  Safra.  “Low  communication  2-prover
zero-knowledge  proofs  for  NP.”  Proc.  of  the  12th  Annual  International  Cryptology
Conference  on  Advances  in  Cryptology:  215-227,  1992. 

Notes:  Discusses,  outlines,  and  proves  a  zero-knowledge  proof  system  for  the 
satisfiability  problem  with  two  provers  and  one  verifier. 

128.  B.  Ferris,  J.  Froehlich.  “WalkSAT  as  an  Informed  Heuristic  to  DPLL  in  SAT
Solving.”  Artificial  Intelligence  Graduate  Course  taught  by  Professor  Dan  Weld:
2005.  Available  at:
http://www.cs.washington.edu/homes/jfroehli/publications/WalkSAT-DPLL.pdf

Notes:  Compares  WalkSAT,  a  stochastic  local  search  algorithm,  with  DPLL,  a 
systematic  search  algorithm.  WalkSAT  is  faster,  but  incomplete  (cannot  prove 
unsatisfiability),  while  DPLL-type  algorithms  are  complete  but  slower. 

129.  E.  Goldberg,  Y.  Novikov.  “BerkMin:  A  Fast  and  Robust  SAT-Solver.”  Proc.  of 
DATE  ’02:  142-149,  2002. 

Notes:  Compares  a  new  algorithm  (BerkMin)  with  GRASP,  SATO,  and  Chaff,  on  which
it  is  based.  Tests  BerkMin  against  these  other  satisfiability  problem  solvers  and
concludes  that  BerkMin  is  more  robust  (can  solve  more  instances),  but  is  not  always 
faster. 

130.  H.H.  Hoos,  T.  Stutzle.  “Local  Search  Algorithms  for  SAT:  An  Empirical
Evaluation.”  Journal  of  Automated  Reasoning,  24(4):  421-481,  2000.

Notes:  Introduces,  discusses,  compares,  and  evaluates  the  stochastic  local  search
algorithms  WalkSAT  and  GSAT  thoroughly.

131.  H.  Jia.  “Hard  Instances  with  Hidden  Solutions.”  PhD  Dissertation,  University  of
New  Mexico:  December,  2007.

Notes:  Introduction  to  several  algorithms  that  exist  for  solving  instances  of  the
satisfiability  problem  and  3-SAT,  as  well  as  a  proposed  method  for  generating
difficult  test  cases  for  these  algorithms.  Algorithms  described:  DPLL,  WalkSAT,
zChaff,  and  SP.

132.  J.  Marques-Silva.  “The  Impact  of  Branching  Heuristics  in  Propositional  Satisfiability 
Algorithms.”  Proc.  of  the  Portuguese  Conference  on  Artificial  Intelligence: 
Progress  in  Artificial  Intelligence:  62-74,  1999. 

Notes:  Describes  several  branching  heuristics  that  are  used  in  effective  satisfiability
solvers  such  as  GRASP,  SATO,  and  rel_sat.  Runs  tests  on  these  algorithms  against
other  algorithms  that  do  not  use  the  same  techniques  to  examine  their  effectiveness.

133.  J.P.  Marques-Silva,  K.A.  Sakallah.  “GRASP:  A  Search  Algorithm  for  Propositional 
Satisfiability.”  IEEE  Transactions  on  Computers,  48(5):  506-521,  May  1999. 

Notes:  Introduces,  outlines,  and  discusses  the  GRASP  algorithm  for  solving  the
satisfiability  problem.  Contains  experimental  results  obtained  from  testing  GRASP
against  several  other  well-known  algorithms  such  as  DPLL,  GSAT,  etc.

134.  M.W.  Moskewicz,  C.F.  Madigan,  Y.  Zhao,  L.  Zhang,  S.  Malik.  “Chaff:  Engineering
an  Efficient  SAT  Solver.”  Proceedings  of  the  38th  Conference  on  Design
Automation:  530-535,  2001.

Notes:  Describes  the  Chaff  algorithm  for  solving  the  satisfiability  problem.  Also
includes  a  description  of  the  DPLL  algorithm  as  comparison,  with  some  comments  on
other  currently  popular  algorithms.

135.  D.N.  Pham,  C.  Gretton.  “gNovelty+.”  From  the  SAT  2007  Competition  web  site.
Available  at:  http://www.satcompetition.org/2007/gNovelty+.pdf

Notes:  Introduces  and  discusses  the  satisfiability  solver  gNovelty+,  based  on  the
first  and  second  place  winners  in  the  random  category  of  the  2005  SAT  competition. 
The  program  was  used  in  the  2007  SAT  competition  (won  gold  in  the  random  SAT 
category). 

136.  H.  Zhang.  “SATO:  An  Efficient  Propositional  Prover.”  Proc.  of  the  14th
International  Conference  on  Automated  Deduction:  272-275,  1997. 

Notes:  Describes  the  update  to  SATO  3.0  and  contains  some  results  of  testing  SATO 
3.0  against  past  versions  of  SATO  as  well  as  other  popular  satisfiability  solver 
algorithms,  such  as  DPLL,  GRASP,  etc.  Concludes  that  SATO  either  performs  best 
or  second  best  on  all  sets  of  data  considered. 

Minimum  Label  Spanning  Tree  Problem 


137.  T.  Brüggemann,  J.  Monnot,  G.J.  Woeginger.  “Local  Search  for  the  Minimum  Label
Spanning  Tree  Problem  with  Bounded  Color  Classes.”  Operations  Research  Letters,
31(3):  195-201,  2003.

Notes:  Discusses  the  complexity  of  the  minimum  label  spanning  tree  problem  when 
every  color  appears  at  most  r  times  in  the  input  graph.  Introduces  local  search 
algorithms  for  this  modified  problem. 

138.  R.  Cerulli,  A.  Fink,  M.  Gentili,  S.  Voß.  “Metaheuristics  Comparison  for  the
Minimum  Labeling  Spanning  Tree  Problem.”  The  Next  Wave  in  Computing,
Optimization,  and  Decision  Technologies.  B.  Golden,  S.  Raghavan,  E.  Wasil  (Eds.),
Springer-Verlag:  93-106,  2005.

Notes:  Introduces  new  metaheuristic  algorithms  for  the  minimum  label  spanning  tree 
problem.  The  metaheuristics  implemented  are  SA,  reactive  tabu  search,  the  Pilot 
method,  and  VNS.  Compares  the  new  algorithms  with  MVCA. 

139.  R.S.  Chang,  S.J.  Leu.  “The  Minimum  Labeling  Spanning  Trees.”  Information
Processing  Letters,  63(5):  277-282,  1997.

Notes:  Proves  that  the  minimum  label  spanning  tree  problem  is  an  NP-complete 
problem  and  introduces  two  algorithms  for  approximating  the  solution.  This  paper
was  the  first  to  consider  this  problem. 

140.  S.  Consoli.  “The  Development  and  Application  of  Metaheuristics  for  Problems  in 
Graph  Theory:  A  Computational  Study.”  Thesis  for  PhD  in  School  of  Information 
Systems,  Computing  and  Mathematics,  Brunel  University,  UK:  November,  2008.

Notes:  Introduces  new  algorithms  for  the  minimum  label  spanning  tree  problem. 
These  include  GRASP,  VNS,  and  a  hybrid  local  search  method.  The  new  algorithms 
are  compared  to  MGA  (modified  genetic  algorithm)  and  the  Pilot  method. 

141.  S.  Consoli,  K.  Darby-Dowman,  N.  Mladenovic,  J.A.M.  Perez.  “Greedy
Randomized  Adaptive  Search  and  Variable  Neighbourhood  Search  for  the  Minimum 
Labelling  Spanning  Tree  Problem.”  European  Journal  of  Operational  Research,  196: 
440-449,  2009. 

Notes:  Introduces  GRASP  and  VNS  algorithms  for  the  minimum  label  spanning  tree 
problem.  Tests  the  algorithms  against  the  Pilot  algorithm  and  several  others.  Testing 
is  done  on  graphs  of  order  up  to  500  and  label  sets  of  size  up  to  625  labels. 

142.  S.O.  Krumke,  H.C.  Wirth.  “On  the  Minimum  Label  Spanning  Tree  Problem.” 
Information  Processing  Letters,  66(2):  81-85,  1998. 

Notes:  Proves  that  there  cannot  exist  a  polynomial  time  constant  factor
approximation  for  the  minimum  label  spanning  tree  problem  unless  P  =  NP.  Tests  the
performance  of  the  algorithms  previously  created  by  Chang  and  Leu  (the  authors  that 
first  introduced  the  problem). 

143.  J.  Nummela,  B.A.  Julstrom.  “An  Effective  Genetic  Algorithm  for  the  Minimum- 
Label  Spanning  Tree  Problem.”  Proc.  of  the  Annual  Conference  on  Genetic  and 
Evolutionary  Computation:  553-558,  2006. 

Notes:  Introduces  several  new  genetic  algorithms  for  the  minimum  label  spanning 
tree  problem.  Tests  and  compares  the  new  algorithms  against  MVCA  on  random 
graphs. 

144.  Y.  Xiong.  “The  Minimum  Labeling  Spanning  Tree  Problem  and  Some  Variants.”
Thesis  for  PhD  at  the  University  of  Maryland:  2005.

Notes:  Contains  an  introduction  to  the  minimum  label  spanning  tree  problem. 
Discusses  a  particularly  difficult  class  of  graphs  for  the  MVCA  algorithm.  Introduces 
new  algorithms  for  the  problem.  Tests  and  compares  the  new  algorithms  on  random
graphs. 

145.  Y.  Xiong,  B.  Golden,  E.  Wasil.  “Improved  Heuristics  for  the  Minimum  Label
Spanning  Tree  Problem.”  IEEE  Transactions  on  Evolutionary  Computation,  10(6), 
700-703,  2006. 

Notes:  Introduces  new  algorithms  that  are  either  modified  MVCA  or  modified 
genetic  algorithms.  Tests  the  new  algorithms  on  random  graphs  and  compares  them 
to  the  unmodified  versions  of  MVCA  and  genetic  algorithms. 

146.  Y.  Xiong,  B.  Golden,  E.  Wasil.  “A  One-Parameter  Genetic  Algorithm  for  the
Minimum  Labeling  Spanning  Tree  Problem.”  IEEE  Transactions  on  Evolutionary 
Computation,  9(1):  55-60,  2005. 

Notes:  Introduces  a  one-parameter  genetic  algorithm  for  the  minimum  label  spanning
tree  problem.  Tests  and  compares  the  new  algorithm  to  MVCA.  Concludes  that  the 
new  algorithm  is  competitive  with  MVCA. 

147.  Y.  Xiong,  B.  Golden,  E.  Wasil.  “Worst-Case  Behavior  of  the  MVCA  Heuristic  for
the  Minimum  Labeling  Spanning  Tree  Problem.”  Operations  Research  Letters,
33(1):  77-80,  2005.

Notes:  Analyzes  the  MVCA  algorithm  and  presents  a  new  worst-case  ratio  for  the 
algorithm.  Introduces  a  family  of  graphs  that  obtain  the  new  ratio,  proving  that  the 
ratio  cannot  be  reduced  further. 

Graph  Partitioning  Problem 


148.  R.  Banos,  C.  Gil,  J.  Ortega,  F.G.  Montoya.  “Multilevel  Heuristic  Algorithm  for
Graph  Partitioning.”  Lecture  Notes  in  Computer  Science,  2611:  143-153,  2003. 

Notes:  Introduces  a  multilevel  algorithm  for  solving  the  graph  partitioning  problem. 
Tests  and  compares  the  new  algorithm  with  METIS,  another  multilevel  algorithm  for 
the  problem,  on  the  benchmark  graphs  maintained  by  Walshaw. 

149.  T.N.  Bui,  B.R.  Moon.  “Genetic  Algorithm  and  Graph  Partitioning.”  IEEE
Transactions  on  Computers,  45(7):  841-855,  July  1996. 

Notes:  Introduces  hybrid  genetic  algorithms  for  the  graph  partitioning  problem. 
Tests  and  compares  the  algorithms  against  the  multistart  KL  algorithm  and  the  SA 
algorithm  on  the  graphs  used  by  Johnson,  et  al.,  1989.

150.  A.  Felner.  “Finding  Optimal  Solutions  to  the  Graph  Partitioning  Problem  with 
Heuristic  Search.”  Annals  of  Mathematics  and  Artificial  Intelligence,  45(3-4):  293- 
322,  Dec.  2005. 

Notes:  Formats  the  graph  partitioning  problem  as  a  search  problem  and  then  applies 
heuristic  methods  to  solve  the  problem.  The  algorithm  does  not  return  suboptimal 
solutions.  Tests  and  compares  this  approach  with  the  current  best  algorithms  on 
randomly  generated  graphs. 

151.  L.  Grady,  E.L.  Schwartz.  “Isoperimetric  Partitioning:  A  New  Algorithm  for  Graph
Partitioning.”  SIAM  Journal  on  Scientific  Computing,  27(6):  1844-1866,  2006.

Notes:  Introduces  a  new  algorithm  for  the  graph  partitioning  problem  based  on 
optimization  of  the  combinatorial  isoperimetric  constant.  Tests  and  compares  the 
algorithm  against  the  spectral  partitioning  method  and  METIS  on  various  classes  of 
graphs.  Concludes  that  the  algorithm  gives  slightly  higher  averages  than  the  other 
algorithms  (like  multilevel  KL). 

152.  D.S.  Johnson,  C.R.  Aragon,  L.A.  McGeoch,  C.  Schevon.  “Optimization  by 
Simulated  Annealing:  an  Experimental  Evaluation,  Part  I,  Graph  Partitioning.” 
Operations  Research,  37:  865-892,  1989.

Notes:  Introduces  a  new  simulated  annealing  algorithm  for  the  graph  partitioning 
problem.  Compares  it  to  existing  algorithms  like  KL  and  local  optimization  methods 
by  testing  the  algorithms  on  both  standard  and  non-standard  random  graphs. 

153.  B.W.  Kernighan,  S.  Lin.  “Partitioning  Graphs.”  The  Bell  System  Technical  Journal: 
291-307,  Feb.  1970.

Notes:  Introduces  the  heuristic  Kernighan-Lin  algorithm.  Concludes  that  the 
algorithm  is  practical  for  solving  large  instances  of  the  graph  partitioning  problem. 

154.  Y.H.  Kim,  B.R.  Moon.  “Lock-Gain  Based  Graph  Partitioning.”  Journal  of
Heuristics,  10:  37-57,  2004.

Notes:  Introduces  the  lock-gain  based  algorithm  for  the  graph  partitioning  problem.
Uses  a  new  method  for  selecting  vertices  to  move  between  partition  classes.  Tests  the
algorithm  on  benchmark  instances  from  other  publications  (Johnson,  et  al.,  1989,  and
Bui  and  Moon,  1996)  and  compares  it  to  existing  algorithms.

155.  R.Z.  Loureiro,  A.R.S.  Amaral.  “An  Efficient  Approach  for  Large  Scale  Graph
Partitioning.”  Journal  of  Combinatorial  Optimization,  13:  289-320,  2007.

Notes:  Introduces  some  greedy  heuristic  algorithms  for  the  graph  partitioning
problem.  Tests  and  compares  the  algorithms  on  benchmark  instances  from  the  graph
partitioning  archive  maintained  by  Walshaw.

Graph  Databases 


141.  The  Graph  Partitioning  Archive.  http://staffweb.cms.gre.ac.uk/~wc06/partition/
(accessed  July  2010).  Maintained  by  Chris  Walshaw.

Notes:  Database  with  test  sets  for  the  graph  partitioning  problem.

142.  The  Stanford  GraphBase.  http://www-cs-faculty.stanford.edu/~uno/sgb.html
(accessed  July  2010).  Maintained  by  Donald  Knuth.

Notes:  Database  with  general  graphs  for  any  problem.  Described  in:

Knuth,  Donald  E.  “The  Stanford  GraphBase:  A  Platform  for  Combinatorial
Algorithms.”  Proceedings  of  the  4th  Annual  ACM-SIAM  Symposium  on  Discrete
Algorithms,  1993:  41-43.

143.  http://www.sergioconsoli.com/MLSTP.htm  (accessed  August  2009).  Maintained  by
Sergio  Consoli.

Notes:  Database  with  test  sets  for  the  minimum  label  spanning  tree  problem.

144.  The  Harwell-Boeing  Collection.
http://math.nist.gov/MatrixMarket/data/Harwell-Boeing/  (accessed  July  2010).

Notes:  Database  with  test  sets  of  matrices  for  the  minimum  bandwidth  problem.

145.  TSPLIB.  http://elib.zib.de/pub/mp-testdata/tsp/tsplib/tsplib.html  (accessed  July
2010).  Maintained  by  Gerhard  Reinelt.

Notes:  Contains  instances  for  numerous  variations  of  the  traveling  salesman  problem. 

146.  The  Graph  Database.  http://amalfi.dis.unina.it/graph/  (accessed  July  2010). 
Maintained  by  SIVALab. 

Notes:  Database  with  test  sets  of  graphs  for  the  sub-graph  isomorphism  problem. 
Described  in: 

De  Santo,  M.,  P.  Foggia,  C.  Sansone,  and  M.  Vento.  “A  Large  Database  of  Graphs
and  Its  Use  For  Benchmarking  Graph  Isomorphism  Algorithms.”  Pattern 
Recognition  Letters  24(2003):  1067-1079. 